ON THE AMOUNT OF DEPENDENCE IN THE PRIME 
FACTORIZATION OF A UNIFORM RANDOM INTEGER 



RICHARD ARRATIA 



Abstract. How much dependence is there in the prime factorization 
of a random integer distributed uniformly from 1 to n? How much de- 
pendence is there in the decomposition into cycles of a random permu- 
tation of n points? What is the relation between the Poisson-Dirichlet 
process and the scale invariant Poisson process? These three questions 
have essentially the same answers, with respect to total variation dis- 
tance, considering only small components, and with respect to a Wasser- 
stein distance, considering all components. The Wasserstein distance is 
the expected number of changes - insertions and deletions - needed to 
change the dependent system into an independent system. 

In particular we show that for primes, roughly speaking, \(2 + o(1)\)
changes are necessary and sufficient to convert a uniformly distributed
random integer from 1 to n into a random integer \(\prod_{p \le n} p^{Z_p}\) in which the
multiplicity \(Z_p\) of the factor p is geometrically distributed, with all \(Z_p\)
independent. The changes are, with probability tending to 1, one deletion,
together with a random number of insertions, having expectation
\(1 + o(1)\).

The crucial tool for showing that \(2 + \varepsilon\) suffices is a coupling of the
infinite independent model of prime multiplicities, with the scale invariant
Poisson process on \((0, \infty)\). A corollary of this construction is the
first metric bound on the distance to the Poisson-Dirichlet in Billingsley's
1972 weak convergence result. Our bound takes the form: there
are couplings in which

\( E \sum_i |\log P_i(n) - (\log n) V_i| = O(\log\log n) \),

where \(P_i\) denotes the \(i\)th largest prime factor and \(V_i\) denotes the \(i\)th
component of the Poisson-Dirichlet process. It is reasonable to conjecture
that \(O(1)\) is achievable.

For the reader impatient to get to business, the main results are given
by the bound (72) in Theorem 3 and the bound (81) in Theorem 5. A
$500 conjecture, related to Theorem 3, is given by relation (p7|), and a $100
conjecture, related to Theorem 5, is given by the bound (fBUj).

1. Lecture 1. Growing a random integer 

I would like to thank the János Bolyai Mathematical Society and the
organizers of this conference for the honor of speaking here, where the spirit
of Erdős seems so close. At the conference reception last night Imre Csiszár,
student of Rényi, suggested that Erdős, now in heaven, has read the book
where all best proofs are given. But I prefer to believe that in heaven, Erdős
by choice does not ask to see the book; he only asks of a proof, "Is there
an even better one in the book?" And he watches us fellow mathematicians
here on earth, as we lecture and write and discover; by not diving into
the book, he can continue to compete against us. For the joy of discovery
through one's own effort is far more rewarding than even reading the book.
I would also like to thank my collaborators, Andrew D. Barbour and Simon
Tavaré, as their work and ideas pervade these lectures. I hope that our book
on logarithmic combinatorial structures [5], in preparation since 1992, will
soon see publication!

The guiding question for today's lecture is to ask, by analogy with the 
Erdős-Rényi notion of growing a random graph, "can we grow a random
integer?" For guidance, we compare with the simpler task of "growing" a 
random permutation. 

NOTATION: n is always the parameter, rather than the random object. 
We consider 

- a permutation chosen uniformly from \(S_n\)

- an integer N chosen uniformly from 1 to n. 

We may emphasize the role of the parameter n by writing it explicitly: 

\( P_n(N = i) = P(N(n) = i) = 1/n \) for \( i = 1, 2, \ldots, n \).

For the prime factorization we write 

\( N(n) = \prod_p p^{C_p(n)} \), so \( (C_p(n))_p = (C_2(n), C_3(n), C_5(n), \ldots) \)

is a dependent process, with \(C_p(n)\) identically zero if \(p > n\).
The baby fact: as \(n \to \infty\),

(1) \( (C_p(n))_p \Rightarrow (Z_p)_p = (Z_2, Z_3, Z_5, \ldots) \)

with independent, geometrically distributed coordinates, with \( P(Z_p \ge k) = p^{-k} \).

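Proof. Given j distinct primes \(p_1, \ldots, p_j\), and integers \(i_1, \ldots, i_j \ge 0\), write
\( d = p_1^{i_1} \cdots p_j^{i_j} \), so that as events,

\( \{ C_{p_1} \ge i_1, \ldots, C_{p_j} \ge i_j \} = \{ d \mid N \} \).

Thus

(2) \( P_n(d \mid N) = \frac{1}{n} \left\lfloor \frac{n}{d} \right\rfloor \to \frac{1}{d} = P(Z_{p_1} \ge i_1, \ldots, Z_{p_j} \ge i_j) \),

and j-dimensional differencing yields

(3) \( P_n(C_{p_1} = i_1, \ldots, C_{p_j} = i_j) \to P(Z_{p_1} = i_1, \ldots, Z_{p_j} = i_j) \).

□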

In fact, the approximation error in (2) is at most \(1/n\), so the error in
(3) is at most \(2^j/n\), a crude upper bound which comes from taking the
absolute value inside. The sum over all integers \( d = \prod_{p \le b} p^{a_p} \) whose largest
prime factor is at most b, i.e.

(4) \( \sum_{d:\, P^+(d) \le b} \left| P(C_p(n) = a_p \ \forall p \le b) - P(Z_p = a_p \ \forall p \le b) \right| \),

gives the total variation distance \(d_{TV}(b,n)\), restricting to primes not exceeding
b, and surprisingly, the crude upper bound \(u(b,n)\), formed by taking
absolute values inside, gives pretty good information. But the information
is not nearly as good as the result of Kubilius; we will describe these bounds
later, in sections 1.5.2 and 1.5.3.
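The error bounds just quoted for (2) and (3) can be checked exactly for a small n. The following is a minimal sketch, not part of the original lectures: the helper names `multiplicity`, `divisibility_error` and `joint_error` are hypothetical, and the choice n = 1000 with the two primes 2 and 3 is only an illustration. It uses exact rational arithmetic, so no rounding is involved.

```python
from fractions import Fraction
from itertools import product


def multiplicity(m, p):
    """Exponent of the prime p in the factorization of m."""
    k = 0
    while m % p == 0:
        m //= p
        k += 1
    return k


def divisibility_error(n, d):
    """|P_n(d | N) - 1/d| for N uniform on 1..n, as in (2)."""
    return abs(Fraction(n // d, n) - Fraction(1, d))


def joint_error(n, primes, exps):
    """|P_n(C_p = i_p for the listed p) - P(Z_p = i_p for the listed p)|, as in (3)."""
    hits = sum(
        1
        for m in range(1, n + 1)
        if all(multiplicity(m, p) == i for p, i in zip(primes, exps))
    )
    limit = Fraction(1)
    for p, i in zip(primes, exps):
        # P(Z_p = i) = p^{-i} (1 - 1/p), since P(Z_p >= k) = p^{-k}
        limit *= Fraction(1, p**i) * (1 - Fraction(1, p))
    return abs(Fraction(hits, n) - limit)


n = 1000
worst_2 = max(divisibility_error(n, d) for d in range(1, n + 1))
worst_3 = max(joint_error(n, (2, 3), e) for e in product(range(4), repeat=2))
print(worst_2 < Fraction(1, n), worst_3 <= Fraction(4, n))
```

With j = 2 primes the bound for (3) is \(2^j/n = 4/n\), which is what the second comparison tests.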

Since \( E Z_p = \frac{1}{p-1} \) with \( \sum_p E Z_p = \infty \), it follows from the independence of
the \(Z_p\) that

\( 1 = P\left( \sum_p Z_p = \infty \right) \).

Thus we call the multiset, having \(Z_p\) copies of p for each prime, the natural
random infinite multiset of primes. The guiding question for today's
lecture, can we grow random integers, is now stated more precisely as: can
we construct \(N(1), N(2), \ldots, N(n), \ldots\) all on a single probability space, together
with the natural random infinite multiset of primes, so that the \(C_p(n)\)
evolve smoothly, with \(C_p(n) \to Z_p\) as \(n \to \infty\)? The answer is yes in a
sense; the construction culminating with (75) at the end of Lecture 3 has
\(C_p(n) \to Z_p\) in probability, but with probability one, \(\liminf C_p(n) = Z_p\) and
\(\limsup C_p(n) = Z_p + 1\).

1.1. Overview: the limits for primes, small and large. This section 
presents the material from one of the two transparencies which were shown 
repeatedly throughout the lectures; the material from the other transparency 
appears in this writeup as Conjecture 1 in section 2.2 and as Conjecture 2
in section SJ.

\(N(n)\) is uniform 1 to n, with two things to focus on:

\( N(n) = \prod_p p^{C_p(n)} \) - small factors,

\( N(n) = P_1(n) P_2(n) \cdots \), \( P_1 \ge P_2 \ge \cdots \), prime or 1 - large factors.

LIMITS in distribution as \(n \to \infty\):

\( (C_p(n))_p \Rightarrow (Z_p)_p \), independent Geometric,

\( \left( \frac{\log P_1(n)}{\log n}, \frac{\log P_2(n)}{\log n}, \ldots \right) \Rightarrow (V_1, V_2, \ldots) \), Poisson-Dirichlet,

proved by Billingsley in 1972, where the limit is now known as the Poisson-
Dirichlet process, with parameter 1.

HOW CLOSE? One coupling has

(5) \( E \sum_{p \le n} |C_p(n) - Z_p| \le 2 + o(1) \),

(6) \( E \sum_i |\log P_i(n) - (\log n) V_i| = O(\log\log n) \).

We note that in (5), \(2 - \varepsilon\) is not possible, while in (6), \(O(1)\) should be
possible. The reason that \(\log\log n\) appears in our bound for (6) is made
clear by equation (63) in conjunction with the proof of Theorem 5.

1.2. Sketch of the coupling to grow N(n). Take a size biased permutation,
say \(Q_1, Q_2, \ldots\), of the prime factors in \(2^{Z_2} 3^{Z_3} \cdots\).

Let \(J(n)\) = the largest partial product \(Q_1 Q_2 \cdots Q_j \le n\). Write \(L = L(n)\)
for the number of factors, so that \(J(n) = Q_1 Q_2 \cdots Q_{L(n)}\).

Show: \(J(n)\) has approximately the same distribution as \(H(n)\), where by
definition \(H(n)\) has the harmonic distribution on [n],

(7) \( P(H(n) = i) = \frac{1/i}{1 + 1/2 + \cdots + 1/n}, \quad i = 1, 2, \ldots, n. \)

Fill in one extra factor, \(P_0(n)\), to be prime or one. Use a random uniform
\(U \in (0,1]\) to choose uniformly from the \(1 + \pi(n/J(n))\) possibilities with
\(J(n)P_0(n) \le n\).

Show that the resulting random integer, \(J P_0 = J(n)P_0(n)\), is close to
uniform. In fact, from Lemma 3,

(8) \( d_{TV}(J(n), H(n)) = O\left( \frac{1}{\log n} \right), \)

and from Lemma 4,

(9) \( d_{TV}(J(n)P_0(n), N(n)) = O\left( \frac{\log\log n}{\log n} \right). \)

Finally, modify the coupling on the event whose probability is the left
side of (9), so that the modified versions of \(J P_0\) are exactly uniform, for all
n.

Noga Alon asked, "Does \(P(J(n)P_0(n) = 1) \sim 1/n\)?" The total variation
distance bound (9) does not determine the answer. Since
\( P(P_0(n) = 1 \mid J(n) = 1) = 1/(1 + \pi(n)) \sim \log n / n \), Alon's question is equivalent to,
"Does \(P(J(n) = 1) \sim 1/\log n\)?" Now if \(Z_p = 0\) for all \(p \le n\) then we must
have \(J(n) = 1\), and \( P(Z_p = 0 \ \forall p \le n) = \prod_{p \le n} (1 - 1/p) \sim e^{-\gamma}/\log n \)
by Mertens' Theorem. Thus Alon's question is equivalent to asking: does
\((1 - e^{-\gamma})/\log n\) give the asymptotic probability that there are one or more
primes less than or equal to n in the infinite multiset and, in the size biased
permutation, some prime greater than n comes before all of them? The
answer, which I did not give at the workshop, but was more or less evident
from (18) and (50), is yes; see Lemma 3 for further details. While this
affirmative answer might give us hope that \(J(n)P_0(n)\) is close to uniform in
the sense that for all \(i \le n\), \(P(J(n)P_0(n) = i) \sim 1/n\), it is clearly not so.
In fact (for the coupling in Lecture 3, which is slightly different from the
coupling described in this lecture), for fixed i, having \(\omega(i)\) distinct prime
factors, as \(n \to \infty\), \( P(J(n)P_0(n) = i) \sim (1 + \omega(i))/n \); see (59) and (60) for
details. Thus i = 1 is the only fixed integer having the correct asymptotic
probability!
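The two asymptotic facts used in this paragraph, Mertens' Theorem and the size of \(1 + \pi(n)\), are easy to look at numerically. The following is a rough sketch only; the function names and the cutoff \(n = 10^5\) are illustrative choices, not anything from the lectures, and floating point is used, so the printed ratios are approximate.

```python
import math

EULER_GAMMA = 0.5772156649015329


def primes_up_to(n):
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]


def prob_no_prime_up_to(n):
    """P(Z_p = 0 for all p <= n) = product over p <= n of (1 - 1/p)."""
    prob = 1.0
    for p in primes_up_to(n):
        prob *= 1.0 - 1.0 / p
    return prob


n = 10**5
mertens_ratio = prob_no_prime_up_to(n) * math.log(n) / math.exp(-EULER_GAMMA)
pi_n = len(primes_up_to(n))
print(mertens_ratio)                   # close to 1 if Mertens' asymptotic is accurate
print((1 / (1 + pi_n)) / (math.log(n) / n))  # ratio of 1/(1 + pi(n)) to log n / n
```

The second ratio approaches 1 only slowly, which reflects the slow convergence in the prime number theorem rather than any issue with the argument.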

How does \(J(n)P_0(n)\) evolve as n grows? Write \(p^\#\) for the smallest prime
larger than p, with \(1^\# = 2\). The following properties hold for all n > 1 and
for all outcomes.

\( J(n)P_0(n) \in [1, n] \).
Each factor has its own smoothness:

\( J(n)/J(n-1) \) is one or a prime,

\( P_0(n) = P_0(n-1), \ P_0(n-1)^\#, \) or 1,
and the jumps in the two factors are linked:

\( J(n) \ne J(n-1) \) implies ( \(J(n) = n\) and \(P_0(n) = 1\) ),

\( P_0(n) < P_0(n-1) \) implies ( \(J(n) = n\) and \(P_0(n) = 1\) ).

1.3. Size biased permutations. Several in the audience requested clarification:
what is a size biased permutation? Given k objects, with "sizes"
or "weights" \(r_1, r_2, \ldots, r_k > 0\), we can carry out a size biased permutation
using k independent standard exponentially distributed "alarm clocks"
\(S_1, S_2, \ldots, S_k\) with \(P(S_i > t) = e^{-t}\) for all \(t > 0\). The labels \(W_i := S_i/r_i\)
are exponentially distributed with \(P(W_i > t) = e^{-r_i t}\) (we say "rate \(r_i\)" or
"mean \(1/r_i\)"), and the labels are independent, with \(P(W_i = W_j) = 0\) for
every \(i \ne j\). The ranking of the labels induces a size biased permutation
of the objects. Observe that \(P(W_i\) is the smallest among \(W_1, W_2, \ldots, W_k)\)
\(= r_i/(r_1 + r_2 + \cdots + r_k)\), which is a familiar fact from the study of finite-
state Markov chains in continuous time, where the \(r_i\) are jump rates. This
calculation shows the distribution of the first item; iterating and using the
memoryless property of exponentials gives a product formula for the
distribution of the full size biased permutation.
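The alarm-clock recipe and the product formula can be written down in a few lines. This is a minimal sketch under stated assumptions: the function names are hypothetical, the weights 1, 2, 3 are an arbitrary illustration, and the objects are ranked from smallest label to largest, which is the convention that matches the "smallest label" computation above.

```python
import random
from fractions import Fraction
from itertools import permutations


def size_biased_permutation(weights, rng):
    """Rank indices 0..k-1 by increasing label W_i = S_i / r_i, S_i standard exponential."""
    labels = [rng.expovariate(1.0) / r for r in weights]
    return sorted(range(len(weights)), key=labels.__getitem__)


def order_probability(weights, order):
    """Product formula: each pick has probability weight / (total weight still unpicked)."""
    remaining = sum(weights)
    prob = Fraction(1)
    for i in order:
        prob *= Fraction(weights[i], remaining)
        remaining -= weights[i]
    return prob


weights = [1, 2, 3]
rng = random.Random(0)
trials = 20000
first_is_heaviest = sum(
    size_biased_permutation(weights, rng)[0] == 2 for _ in range(trials)
)
print(first_is_heaviest / trials)            # empirical frequency, near 3/6
print(order_probability(weights, (2, 1, 0)))  # exact: 3/6 * 2/3 * 1/1
```

The exact formula is the thing to trust here; the simulation is only a sanity check that the exponential labels produce the same law.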

For primes, the size of p is \(\log p\), and such a size biased permutation
was used in the 1984 Ph.D. thesis of Eric Bach [13] to generate uniformly
distributed random integers, factored into primes, and independently by
Donnelly and Grimmett in 1993 [22] to give a simple proof of Billingsley's
Poisson-Dirichlet limit for prime factors - discussed at (77) below. In our
context, the infinite random multiset of primes, there are infinitely many
labels, but with probability one, the only limit point is zero, and there is a
largest label. Our size biased permutation starts with the prime having this
largest label; listing the labels from largest downward tends to put smaller
primes toward the front of the list.



1.4. An example of the growth of an integer. Take for example an
outcome of the experiment with

(10) \( Z_2 = 3, \ Z_3 = 1, \ Z_5 = 0, \ Z_7 = 1, \ Z_{11} = 1, \ldots \)

and size biased permutation

(11) \( Q_1, Q_2, \ldots, Q_6, \ldots = 3, 2, 2, 11, 2, 7, \ldots \).

The first seven partial products, one through 3·2·2·11·2·7, shown on
a logarithmic scale, are

  •   •   •   •    •     •     •      ↑
  1   3   6   12   132   264   1848   24024

The arrow pointing to 24024 = 1848 · 13 is to show that the next partial
product will lie on or to the right of this location. We know this because
the information \( 6 = \sum_{p < 13} Z_p \), together with the good luck that the first
six primes in the size biased permutation are less than 13, implies that
the seventh prime in the size biased permutation will be 13 or greater. In
particular, for \( 1848 \le n < 13 \times 1848 \), we know that \(J(n) = 1848\), even
though we don't know the seventh prime in the size biased permutation,
accounting for the last line of the table below.

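The jumps in J(n) are shown by double horizontal lines in the two tables
below.

  n                    J(n)    1 + π(n/J(n))       P_0(n)
  -------------------------------------------------------------------------
  1                    1       1                   1
  2                    1       2                   1 if U ≤ .5, 2 if U > .5
  =========================================================================
  3, 4, 5              3       1                   1
  =========================================================================
  6 to 11              6       1                   1
  =========================================================================
  12 to 23             12      1                   1
  24 to 35             12      2                   1 if U ≤ .5, 2 if U > .5
  36 to 59             12      3                   1 if U ≤ 1/3, ..., 3 if U > 2/3
  60 to 83             12      4                   1 if U ≤ 1/4, ..., 5 if U > 3/4
  84 to 131            12      5                   1, 2, 3, 5, or 7
  =========================================================================
  132 to 263           132     1                   1
  =========================================================================
  [264, 2·264)         264     1                   1
  [2·264, 3·264)       264     2                   1 if U ≤ .5, 2 if U > .5
  [3·264, 5·264)       264     3                   1 if U ≤ 1/3, ..., 3 if U > 2/3
  [1320, 1848)         264     4                   1, 2, 3, or 5
  =========================================================================
  [1848, 13·1848)      1848    1, 2, 3, 4, 5, or 6  1, 2, 3, 5, 7, or 11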



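To continue the above example, we will consider three cases: the first
being \(U \in (.5, .6]\), the next being \(U \in (.6, 2/3]\), and the last being \(U > 5/6\).

                  .5 < U ≤ .6            .6 < U ≤ 2/3           5/6 < U
  n               P_0(n)  J(n)P_0(n)     P_0(n)  J(n)P_0(n)     P_0(n)  J(n)P_0(n)
  ---------------------------------------------------------------------------------
  1               1       1              1       1              1       1
  2               2       2              2       2              2       2
  =================================================================================
  3, 4, 5         1       3              1       3              1       3
  =================================================================================
  [6, 11]         1       6              1       6              1       6
  =================================================================================
  [12, 23]        1       12             1       12             1       12
  [24, 35]        2       24             2       24             2       24
  [36, 59]        2       24             2       24             3       36
  [60, 83]        3       36             3       36             5       60
  [84, 131]       3       36             5       60             7       84
  =================================================================================
  [132, 263]      1       132            1       132            1       132
  =================================================================================
  [264, 528)      1       264            1       264            1       264
  [528, 792)      2       528            2       528            2       528
  [792, 1320)     2       528            2       528            3       792
  [1320, 1848)    3       792            3       792            5       1320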
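Both tables can be regenerated mechanically from the six known terms 3, 2, 2, 11, 2, 7 of the size biased permutation. The sketch below is an illustration under assumptions that are mine rather than the text's: the names `J`, `candidates`, `P0` and `grown` are hypothetical, and the uniform choice is implemented by taking the candidate of index \(\lceil U k \rceil\) among the \(k = 1 + \pi(n/J(n))\) candidates listed in increasing order, which reproduces the "1 if U ≤ .5, 2 if U > .5" pattern of the first table.

```python
import math

Q = [3, 2, 2, 11, 2, 7]   # Q_1..Q_6 from (11)
NEXT_AT_LEAST = 13        # Q_7 is only known to be at least 13

PRODUCTS = [1]
for q in Q:
    PRODUCTS.append(PRODUCTS[-1] * q)      # 1, 3, 6, 12, 132, 264, 1848
N_MAX = PRODUCTS[-1] * NEXT_AT_LEAST - 1   # J(n) is determined only for n <= N_MAX


def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, math.isqrt(m) + 1))


def J(n):
    """Largest partial product Q_1...Q_j not exceeding n."""
    if not 1 <= n <= N_MAX:
        raise ValueError("J(n) is not determined by the first six terms")
    return max(x for x in PRODUCTS if x <= n)


def candidates(n):
    """The 1 + pi(n/J(n)) values (one, or a prime) allowed for P_0(n)."""
    bound = n // J(n)
    return [1] + [p for p in range(2, bound + 1) if is_prime(p)]


def P0(n, u):
    """Pick a candidate uniformly, driven by a fixed u in (0, 1]."""
    c = candidates(n)
    return c[math.ceil(u * len(c)) - 1]


def grown(n, u):
    return J(n) * P0(n, u)


for n in (2, 30, 50, 70, 100, 200, 600, 1000, 1500):
    print(n, J(n), len(candidates(n)), [grown(n, u) for u in (0.55, 0.65, 0.9)])
```

The three values of u stand for the three cases \(U \in (.5, .6]\), \(U \in (.6, 2/3]\) and \(U > 5/6\) of the second table.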



Recall that \( d_{TV}(J(n)P_0(n), N(n)) \to 0 \), but the random integer we grow
is not exactly uniform. For comparison, random permutations have similar
behavior and can be grown exactly. Eric Bach's procedure gets an exactly
uniform random integer, but not with n evolving.

1.5. Notions of distance.

1.5.1. The expected number of insertions and deletions needed. 

The material below on \(d_W\) was delivered at the workshop at the start of
Lecture 2; the material on \(d_{TV}\) was drawn out in workshop conversations
with Joel Spencer, and I prepared a transparency for the lecture but did not
deliver it, for lack of time.

We consider the metric d on positive integers which counts the number
of changes needed to convert the prime factorization of one integer into
that of the other. For example, \( d(40, 500) = d(2^3 5^1, 2^2 5^3) = 3 \), \( d(8, 3) = 4 \),
and \( d(i, ip) = 1 \) for any integer i and prime p. Writing \((i, j)\) for the
greatest common divisor, and \(\Omega(i)\) for the number of prime factors, including
multiplicities, we have in general

\( d(i, j) = \Omega\left( \frac{i}{(i,j)} \right) + \Omega\left( \frac{j}{(i,j)} \right). \)

If we think of converting j to i, the first term above is the number of insertions
needed, and the second term is the number of deletions, so we think
of d as the insertion/deletion distance, analogous to the string edit distance
of Levenshtein [36] or Ulam [16]: see the book of Kruskal and Sankoff [33] for
more history.
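The three worked values of d quoted above can be reproduced directly. The sketch assumes that the general formula is \(\Omega(i/(i,j)) + \Omega(j/(i,j))\), which is what the description of its two terms (insertions, then deletions) indicates; the function names are hypothetical and trial division is used only because the inputs are small.

```python
from math import gcd


def big_omega(m):
    """Number of prime factors of m, counted with multiplicity."""
    count, p = 0, 2
    while p * p <= m:
        while m % p == 0:
            m //= p
            count += 1
        p += 1
    return count + (1 if m > 1 else 0)


def insertion_deletion_distance(i, j):
    """Changes needed to turn the prime factorization of j into that of i."""
    g = gcd(i, j)
    insertions = big_omega(i // g)   # primes of i that j lacks
    deletions = big_omega(j // g)    # primes of j that i lacks
    return insertions + deletions


print(insertion_deletion_distance(40, 500))
print(insertion_deletion_distance(8, 3))
print(insertion_deletion_distance(12, 12 * 7))
```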

For the sake of comparing the uniform random N(n) with the infinite
random multiset of primes, clearly primes p > n should not be considered.
Thus, we code up the relevant part of the multiset by defining

(12) \( M(n) := \prod_{p \le n} p^{Z_p} \), with \(Z_2, Z_3, \ldots\) independent geometric.

Recall that

\( N(n) = \prod_p p^{C_p(n)} = \prod_{p \le n} p^{C_p(n)} \) is uniform 1 to n.

The insertion-deletion distance between these two random integers is a
random, nonnegative integer

\( d(N, M) = \Omega\left( \frac{NM}{(N,M)^2} \right) = \sum_{p \le n} |C_p(n) - Z_p|. \)

The Wasserstein distance between two random objects M and N, for a given
metric d, is by definition the infimum, over all couplings, of the expected
value of d(N, M). Recall that a coupling means a construction of M and
N simultaneously on a single probability space; it is understood that the
marginal distributions for M and for N have been specified in advance,
but there is no other constraint on their joint distribution. A compactness
argument shows that the inf is achieved; see for example [23]. To emphasize
the role of the parameter n, which determines the marginal distributions of
N(n) and M(n), we define

(13) \( d_W(n) := \min_{\text{couplings}} E\, d(N(n), M(n)). \)

Our result is

(14) \( \lim_{n \to \infty} d_W(n) = 2. \)

We will prove the hard part of this, that \(\limsup d_W(n) \le 2\), in Lecture
3, essentially by analyzing the growth of a random integer. The matching
lower bound, from [2], is that \(\liminf d_W(n) \ge 2\); this is relatively easy, and
the reader is challenged to discover a proof, before considering the hint given
by the paragraph following (72). A related discussion appears in section 22
of m-

For perspective on the content of (14), we note that even the bound
\(d_W(n) = O(1)\) is very strong. For instance, by comparing \((C_p(n))_{p \le n}\) with
the independent process \((Z_p)_{p \le n}\), the following consequences can be derived
easily (see [2]):

\(d_W(n) = o(\log\log n)\) implies the Hardy-Ramanujan Theorem for the normal
order of the number of prime divisors.

\(d_W(n) = o(\sqrt{\log\log n})\) implies the Erdős-Kac Central Limit Theorem.

\(d_W(n) = O(1)\) gives another proof of the "conjecture of LeVeque," that
the error in the Central Limit Theorem is \(O(1/\sqrt{\log\log n})\); the first proof
was given by Rényi and Turán in 1957 [40].

\(d_W(n) = o(\log\log\log n)\) yields that the optimal rate in the simplest case of
the Brownian motion convergence of Billingsley and Philipp, [18], [19], [38], for
the expected sup norm, is order of \(\log\log\log n/\sqrt{\log\log n}\), and no smaller.
The underlying idea is that for coupling Brownian motion with the rate one
(centered) Poisson process, with both processes run until the variance is t,
and without rescaling by \(\sqrt{t}\), the coupling distance grows on the order of
\(\log t\). For primes this is applied with \( t := \sum_{p \le n} 1/p \sim \log\log n \). See Kurtz
(1978) [35] for the upper bound, Rio (1994) [H] for the lower bound, and
[2] for the connection with primes.

1.5.2. The total variation distance. This section gives a "one trans- 
parency" overview of the situation involving total variation distance for 
primes. 

NOTATION: \(\beta \in [0, 1]\), fixed or \(\beta = \beta(n) \to 0\); \(u = 1/\beta\).

(15) \( d_{TV} := d_{TV}(n^\beta, n) := \min_{\text{couplings}} P\left( (C_p(n))_{p \le n^\beta} \ne (Z_p)_{p \le n^\beta} \right). \)

Kubilius [34] in the 1950's showed a) below with an upper bound of the
form \(\exp(-cu)\), Barban and Vinogradov [T3] improved this to the form
\(\exp(-cu \log u)\), and Elliott [24] gave the particular constants in b) below.

a) If \(\beta \to 0\) then \(d_{TV} \to 0\), and

b) \( d_{TV} = O\left( \exp(-\tfrac{1}{8} u \log u) + n^{-1/15} \right). \)

Elliott [24] gives a partial converse: if \(\beta \to 1\) then \(d_{TV} \not\to 0\), and it is not
hard to see the full converse, that \(d_{TV} \to 0\) implies \(\beta \to 0\). In Spring 1996
I stated a conjecture: that for fixed \(\beta\),

c) \( H(\beta) := \lim_{n \to \infty} d_{TV} \) exists, and is the same as the limit for permutations.
This limit is given explicitly as an integral in [IT], and proved to be
the limit for permutations in Stark's 1994 PhD Thesis; see [43].

Later in 1996 Tenenbaum [45] improved b) to

\( d_{TV} = O\left( \exp(-u(\log u + \log\log u - (1 + \log 2 + \varepsilon))) + n^{\varepsilon - 1} \right) \)

and Arratia and Stark [10] proved c). Soon after Tenenbaum [15] also proved
c), with a rate. The limit was further identified [8] as a distance between the
restrictions to \([0, \beta]\) of two processes which will be discussed in lectures 3 and
4, the Poisson-Dirichlet process with parameter 1, and the scale invariant
Poisson process with intensity \((1/x)\,dx\): for every \(\beta \in [0, 1]\),

(16) \( H(\beta) = d_{TV}(\{V_i : V_i \le \beta\}, \{X_i : X_i \le \beta\}) \)
     \( = \min_{\text{couplings}} P(\{V_i : V_i \le \beta\} \ne \{X_i : X_i \le \beta\}). \)

1.5.3. The bound from taking absolute values inside. The crude procedure
of "taking the absolute values inside" described following (4) shows
that the total variation distance \(d_{TV}(n^\beta, n)\) is at most \(u(n^\beta, n)\) where

(17) \( u(b, n) := \frac{1}{n} \sum_{d > 1:\, P^+(d) \le b} 2^{\omega(d)} \left\{ \frac{n}{d} \right\}. \)

The notation here is \(\{x\}\) for the fractional part of x, \(\omega(d)\) for the number
of distinct prime factors of d, and \(P^+(d)\) for the largest prime factor
of d, with \(P^+(1) = 1\). Analysis of \(u(b,n)\) (see [2]) shows that when
\(b, n \to \infty\) together, the threshold for whether \(u(b, n)\) tends to zero or
infinity is \( \log b = (.5 \pm \varepsilon) \log n \, \log\log\log n / \log\log n \). In fact if
\( \log b \le \log n \, \log\log\log n / (c \log\log n) \) with \(c > 2 + a > 2\), then \( u(b, n) = o((\log n)^{-a}) \),
while if \( \log b \ge \log n \, \log\log\log n / (c \log\log n) \), for \(c < 2\), then \(u(b,n) \to \infty\).
While much weaker than the upper bound of Kubilius, the upper bound
\(u(b, n)\) on the total variation distance has the virtue that it also serves as
an upper bound on the Wasserstein distance: for all \(1 \le b \le n\),

(18) \( d_{TV}(b,n) \le d_W(b,n) \le u(b,n), \)

where \(d_W(b, n)\) is the minimum expected number of insertions and deletions
needed to convert \(\prod_{p \le b} p^{C_p(n)}\) to \(\prod_{p \le b} p^{Z_p}\). [The \(d_W(n)\) in (13) is the special
case b = n of this, i.e. \(d_W(n) = d_W(n,n)\).] While we know the limit (16)
for \(d_{TV}(n^\beta, n)\) as \(n \to \infty\) with \(\beta \in (0, 1)\) fixed, the corresponding behavior
of \(d_W(b, n)\) remains unknown.

Open question. What is the limit of \(d_W(n^\beta, n)\) as \(n \to \infty\) for fixed
\(0 < \beta < 1\)?

We know that for \(\beta = 1\), the limit is 2; this is (14). For all \(\beta \in [0, 1]\), it
can be seen from the coupling in Lecture 4 that \(\limsup d_W(n^\beta, n) \le 2\beta\) -
but only for \(\beta = 1\) is there a matching lower bound. The natural candidate
for the limit of the distance is the Wasserstein distance for the limit systems,

(19) \( H_W(\beta) := d_W(\{V_i : V_i \le \beta\}, \{X_i : X_i \le \beta\}) \)
     \( = \min_{\text{couplings}} E \left| \{V_i : V_i \le \beta\} \,\triangle\, \{X_i : X_i \le \beta\} \right|. \)

Thus, the open question involves two tasks. First, prove (or disprove!) that
the limit exists and equals \(H_W(\beta)\).

Conjecture 0. \( \forall \beta \in [0,1], \ H_W(\beta) = \lim d_W(n^\beta, n). \)

The second task in our open question is to find an explicit formula for
\(H_W(\beta)\). We know only \(H_W(1) = 2\), \(H_W(\beta) \le 2\beta\), and trivially, \(H_W(0) = 0\)
and \(H_W\) is monotone.

2. Lecture 2. Growing a random permutation 

This is a fresh start — the following can be understood easily from scratch, 
without worrying about the connection with prime factorizations. However, 
our example for permutations below has been carefully cooked to match the 
example for primes in Lecture 1. 

Write \(C_i(n)\) for the number of cycles of length i in our random permutation
\(\pi \in S_n\), so that always \(C_i(n) = 0\) if \(i > n\), and \(n = \sum_i i\, C_i(n)\). The
analog of Theorem 1, provable easily with inclusion-exclusion, is that as
\(n \to \infty\)

(20) \( (C_i(n))_i \Rightarrow (Z_i)_i = (Z_1, Z_2, Z_3, Z_4, \ldots) \)

with independent coordinates \(Z_i\), Poisson distributed with \(E Z_i = 1/i\). (Note,
we are deliberately re-using the same notation that we used for prime
factorizations, although the index must be a prime in the latter case. To
contrast the two situations, \(Z_p\) is either geometrically distributed, with
\(P(Z_p \ge k) = p^{-k}\) and \(E Z_p = 1/(p-1)\), or else \(Z_p\) is Poisson, with
\(E Z_p = 1/p\).) That there is an independent process limit, without rescaling,
for both prime factorizations of random integers, and the cycle structure of
random permutations, is the most basic ingredient in the analogy between
these two structures.

Historical notes: the similarity between primes and permutations appears
to have been noted first in 1976 by Knuth and Trabb Pardo [32]. The
similarity they note is based on the common limit behavior for the \(i\)th largest
prime factor and the \(i\)th longest cycle; further aspects of the similarity are
discussed in [6], which sets forth the role of "conditioning" the independent
limit process as the fundamental reason for this similarity. In contrast, in
these lectures we will focus on the closeness of primes and permutations
to the scale invariant Poisson process, with its relations involving spacings
and size biased permutations, as the underlying reason for the similarity.
A reasonable attribution for the construction below, based on canonical
cycle notation, is Feller 1945 [26], but the explicit connection with cycle
lengths seems to appear first in the unpublished lecture notes [20]; it was
independently discovered in [15] and elaborated on in [4], which attached
the name "Feller coupling" to this construction.

Start with the canonical cycle notation for a permutation \(\pi\). For example,
the permutation \(\pi\) with 1 ↦ 5, 2 ↦ 2, 3 ↦ 1, 4 ↦ 4, 5 ↦ 3, 6 ↦ 7, 7 ↦ 6 is
written as \(\pi\) = (1 5 3)(2)(4)(6 7). In writing the canonical cycle notation for
a random \(\pi \in S_7\), one always starts with "(1 ", and then makes a seven-
way choice, between "(1)(2 ", "(1 2 ", ..., and "(1 7 ". One continues
with a six-way choice, a five-way choice, ..., a two-way choice, and finally a
one-way choice.

Let \(\xi_i\) be defined as the indicator function \(\xi_i\) = 1(close off a cycle when
there is an i-way choice). Thus

\( P(\xi_i = 1) = \frac{1}{i}, \quad P(\xi_i = 0) = \frac{i-1}{i}, \) and \(\xi_1, \xi_2, \ldots, \xi_n\) are independent.

An easy way to see the independence of the \(\xi_i\) is to take \(D_i\) chosen from
1 to i to make the i-way choice, so that \(\xi_i = 1(D_i = 1)\). Absolutely no
computation is needed to verify that the map constructing canonical cycle
notation, \((D_1, D_2, \ldots, D_n) \mapsto \pi\), from \([1] \times [2] \times \cdots \times [n]\) to \(S_n\), is a bijection.
The "decision" variables \(D_1, D_2, \ldots, D_n\) determine the random permutation
on n points, while the Bernoulli variables \(\xi_1, \xi_2, \ldots, \xi_n\) determine the cycle
structure, and something more - a size biased permutation SB(n) of the
cycle lengths! The total number of cycles is

\( K_n := \#\,\text{cycles} = \xi_1 + \xi_2 + \cdots + \xi_n, \) with \( E K_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n} = \gamma + \log n + o(1). \)
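The map from decision variables to canonical cycle notation is short enough to write out and test exhaustively for small n. The sketch below is one way to make the i-way choice concrete, and that convention is my assumption: \(D_i = 1\) closes the current cycle and opens a new one with the smallest unused point, while \(D_i = d > 1\) appends the \((d-1)\)th smallest unused point to the open cycle. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def cycles_from_decisions(D):
    """D[i-1] in 1..i is the decision at the i-way choice; returns canonical cycles."""
    n = len(D)
    unused = list(range(2, n + 1))
    cycles = [[1]]
    for i in range(n, 1, -1):            # the n-way choice comes first
        d = D[i - 1]
        if d == 1:                       # xi_i = 1: close off, start a new cycle
            cycles.append([unused.pop(0)])
        else:                            # extend the open cycle
            cycles[-1].append(unused.pop(d - 2))
    return cycles                        # the final 1-way choice closes the last cycle


def as_mapping(cycles):
    """The permutation as a tuple (pi(1), ..., pi(n))."""
    n = sum(len(c) for c in cycles)
    image = [0] * (n + 1)
    for c in cycles:
        for a, b in zip(c, c[1:] + c[:1]):
            image[a] = b
    return tuple(image[1:])


def all_decisions(n):
    return product(*(range(1, i + 1) for i in range(1, n + 1)))


example = cycles_from_decisions((1, 2, 1, 1, 1, 3, 5))
print(example, as_mapping(example))

n = 5
cycle_counts = [len(cycles_from_decisions(D)) for D in all_decisions(n)]
print(Fraction(sum(cycle_counts), len(cycle_counts)))   # exact mean number of cycles
```

Under this convention the decisions (1, 2, 1, 1, 1, 3, 5) give back the example permutation (1 5 3)(2)(4)(6 7), and the number of cycles equals the number of indices i with \(D_i = 1\).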

[Incidentally, the relations above give a quick and dirty, but pretty, way
to see that the entropy \(h((C_i(n))_i)\) for the cycle structure of a random
permutation of n objects is asymptotically \((\log n)^2/2\). Namely, the entropy
of a size biased permutation of k objects is at most the entropy of a uniform
permutation of k objects, which is \(\log k! \le k \log k\). Thus the entropy
\(h(SB(n))\) of a size biased permutation of the \(K_n\) cycle lengths of a random
n-permutation is at most \(E\, K_n \log K_n \sim \log n \log\log n\). The entropy
of the Bernoulli(1/i) random variable \(\xi_i\) is
\( h(\xi_i) = -\left( \frac{1}{i} \log\frac{1}{i} + \frac{i-1}{i} \log\frac{i-1}{i} \right) = \log i - \log(i-1) + \frac{1}{i} \log(i-1) \) for \(i \ge 2\), so

\( \sum_{i=1}^{n} h(\xi_i) = \sum_{i=2}^{n} \left( \log\frac{i}{i-1} + \frac{1}{i} \log(i-1) \right) = \log n + \sum_{i=2}^{n} \frac{\log(i-1)}{i}. \)

Since the cycle structure together with the size biased permutation of the
cycle lengths determine \(\xi_1, \ldots, \xi_n\), we have
\( (\log n)^2/2 \sim \sum_1^n h(\xi_i) = h((C_i(n))_i) + h(SB(n)) = h((C_i(n))_i) + O(\log n \log\log n) \),
hence \( h((C_i(n))_i) \sim (\log n)^2/2 \).
As an exercise, the reader can try to verify this directly from Cauchy's formula,
and perhaps (open problem?) give an asymptotic expansion!]

The coupling which "grows" a random permutation requires a very simple
idea: for all values of n, use the same \(D_1, D_2, \ldots\), and hence the same
\(\xi_1, \xi_2, \ldots\).

The Feller coupling, as motivated by the process of writing out canonical
cycle notation, "reads" \(\xi_1 \xi_2 \cdots \xi_n\) from right to left: the length of the first
cycle is the waiting time to the first one, the length of the next cycle is the
waiting time to the next one, and so on. The multiset of cycle lengths can be
determined without regard to right or left: every i-spacing in \(1 \xi_2 \xi_3 \cdots \xi_n 1\),
that is, every pattern of two ones separated by \(i - 1\) zeros, corresponds to a
cycle of length i. The spacing from the rightmost one in \(1 \xi_2 \xi_3 \cdots \xi_n\) to the
"artificial" one at position n + 1 corresponds to the first cycle in canonical
cycle notation, and also to the factor \(P_0\) in our construction for growing an
almost uniform random integer \(J(n)P_0(n)\).

Since we are interested in growing with n, we will always read \(\xi_1 \xi_2 \xi_3 \cdots\)
from left to right. Recall that \(\xi_1 = 1\) identically, so \(\xi_1 \xi_2 \xi_3 \cdots\) is a random
infinite word in the alphabet {0,1}, starting with a 1. Almost surely, this
sequence has infinitely many ones, since the \(\xi_i\) are independent with
\( \sum_{i \ge 1} P(\xi_i = 1) = \sum_{i \ge 1} 1/i = \infty \). We define the inter-one spacings \(B_1, B_2, \ldots \in \mathbb{N}\) by the
requirement that

\(\xi_1 \xi_2 \xi_3 \cdots\) has ones at \(1, \ 1 + B_1, \ 1 + B_1 + B_2, \ldots\), and nowhere else.

The example which matches the example (11) from Lecture 1, which was
3, 2, 2, 11, 2, 7, p with \(p \ge 13\), and partial products 1, 3, 6, 12, 132, 264, 1848, 1848p,
is the sequence

(21) \( \xi_1 \xi_2 \cdots \xi_{20} = 10111000011000100000, \)

  •  •  •  •  •   •   •   ↑
  1  3  4  5  10  11  15  21

or equivalently,

\( B_1, \ldots, B_6, B_7 = 2, 1, 1, 5, 1, 4, B_7 \), with \(B_7 \ge 6\).
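The spacings in this example can be read off mechanically from the word in (21). The following is a small sketch with hypothetical function names; it treats the word as a string, adds the artificial one at position n + 1, and returns the cycle lengths for any \(n \le 20\).

```python
from collections import Counter

XI = "10111000011000100000"   # xi_1 ... xi_20 from (21)


def one_positions(xi, n):
    """Positions j <= n (1-based) with xi_j = 1."""
    return [j for j in range(1, n + 1) if xi[j - 1] == "1"]


def cycle_lengths(xi, n):
    """Spacings between consecutive ones in 1 xi_2 ... xi_n 1."""
    ones = one_positions(xi, n) + [n + 1]   # artificial one at position n + 1
    return [b - a for a, b in zip(ones, ones[1:])]


def cycle_counts(xi, n):
    """C_i(n) as a Counter mapping i to the number of i-spacings."""
    return Counter(cycle_lengths(xi, n))


def first_cycle_length(xi, n):
    """Spacing from the rightmost one among xi_1..xi_n to the artificial one."""
    return n + 1 - max(one_positions(xi, n))


for n in (5, 12, 20):
    print(n, cycle_lengths(XI, n), first_cycle_length(XI, n))
```

For each n the spacings sum to n, as cycle lengths of a permutation of n points must.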

Do you see the correspondence between 3,2,2,11,2,7 ,p with p > 13, and 
10111000011000100000 • • • ? Writing p { for i th smallest prime p h the prime 
and permutations examples match in that the k th prime on the list of primes 
is p Bk - 
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In the Feller coupling, the cycle structure of a random ir £ S n has been 
realized via C{(n) = #i-spacings in 1^2^3 " " "£nl- For comparison, consider 



An easy calculation, (|22H and (I23p below, shows P(Cj(ra) 7^ Cj(oo)) < 
$2/(n+1)$. Recall that convergence in distribution for infinite sequences, as in (20), is equivalent to having convergence for the restriction to the first $k$ coordinates, for all $k$. Thus

$$\mathbb{P}\big( (C_1(n),\ldots,C_k(n)) \ne (Z_1,\ldots,Z_k) \big) \le \frac{2k}{n+1} \to 0$$

for all $k$, so that $(C_1(n),\ldots,C_k(n)) \Rightarrow (Z_1,\ldots,Z_k)$, and hence $(C_1(n),C_2(n),\ldots) \Rightarrow (C_1(\infty),C_2(\infty),\ldots)$. Comparison with (20) shows that the $C_i(\infty)$ are independent, Poisson, with $\mathbb{E}C_i(\infty) = 1/i$.

2.1. A size biased permutation of the multiset having $i$ with multiplicity $Z_i \sim$ Poisson$(1/i)$. The above indirect argument, that the $C_i(\infty)$ are independent, Poisson$(1/i)$, was presented at Oberwolfach one morning in August 1993; Erdős and Svante Janson were in the audience. Svante asked if there were a direct proof; I said I didn't know of one. Before lunch time, Svante Janson found and presented the following direct argument. Start with the $Z_i$, given to be independent, Poisson$(1/i)$. Take a random infinite multiset, having $Z_i$ copies of $s_i := 0^{i-1}1$, the string of length and weight $i$. Take a size biased permutation of this multiset to get a list $R_1, R_2, \ldots$, so that by construction, the number of $i$-spacings in the string $1R_1R_2\cdots$ is $Z_i$, for each $i$. Calculation ([H], section 9.1) shows that the random string $1R_1R_2\cdots$ of zeros and ones has the same distribution as $\xi_1\xi_2\cdots$. And as an historical note: there already existed yet another direct argument, by marking Poisson processes. Jim Pitman describes it this way: "As observed in Diaconis-Pitman [20], the fact that the numbers of $i$-spacings in the Bernoulli$(1/j)$ sequence are independent Poisson$(1/i)$ is an immediate consequence of the structure of records of a sequence of i.i.d. uniform $(0,1)$ variables $U_1, U_2, \ldots$. For if $N_1 < N_2 < \cdots$ are the successive record indices, then by a well known result of Rényi the indicators $1(N_k = j \text{ for some } k)$ are independent Bernoulli$(1/j)$, and as shown by Ignatov [29] the numbers of $i$-spacings in the record sequence are independent Poisson$(1/i)$."

2.2. Keeping score for the Feller coupling. Starting from the independent Bernoulli $\xi_j$ with $\mathbb{P}(\xi_j = 1) = 1/j$, and defining $Z_i$ as $Z_i := C_i(\infty)$, we have a coupling of the cycle structures $(C_i(n))_i$ for $n = 1, 2, \ldots$, together with the independent Poisson $Z_i$ with $\mathbb{E}Z_i = 1/i$. Note that for the event that an $i$-spacing occurs with right end at $k$, the probability is a simple telescoping product:

$$\mathbb{P}(\xi_{k-i}\cdots\xi_k = 1\,0^{i-1}\,1) = \frac{1}{k-i}\cdot\frac{k-i}{k-(i-1)}\cdots\frac{k-3}{k-2}\cdot\frac{k-2}{k-1}\cdot\frac{1}{k} = \frac{1}{(k-1)k}.$$

We can have $C_i(n) < Z_i$, due to $i$-spacings whose right end occurs after position $n+1$; the expected number of times this occurs is

$$\sum_{k>n+1} \mathbb{P}(\xi_{k-i}\cdots\xi_k = 1\,0^{i-1}\,1) = \sum_{k>n+1} \frac{1}{(k-1)k} = \frac{1}{n+1}. \tag{22}$$

The only way that $C_i(n) > Z_i$ can occur is if the "artificial" one at position $n+1$ in $1\xi_2\cdots\xi_n 1$ creates an extra $i$-spacing; for each $n$ this can occur for at most one $i$, and it occurs for each $1 \le i \le n$ with the same probability,

$$\mathbb{P}(\xi_{n-i+1}\cdots\xi_{n+1} = 1\,0^{i}) = \frac{1}{n-i+1}\cdot\frac{n-i+1}{n-i+2}\cdots\frac{n}{n+1} = \frac{1}{n+1}. \tag{23}$$
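The two telescoping products above, and the sum in (22), can be checked in exact rational arithmetic. The following is a minimal sketch, assuming only the independence of the $\xi_j$ with $\mathbb{P}(\xi_j = 1) = 1/j$; the function names are illustrative and do not come from the lectures.

```python
from fractions import Fraction


def spacing_prob(i, k):
    """P(xi_{k-i} ... xi_k = 1 0^{i-1} 1) for independent xi_j, P(xi_j = 1) = 1/j."""
    if not 1 <= i < k:
        raise ValueError("need 1 <= i < k")
    prob = Fraction(1, k - i) * Fraction(1, k)   # ones at positions k-i and k
    for j in range(k - i + 1, k):                # zeros strictly in between
        prob *= Fraction(j - 1, j)
    return prob


def extra_spacing_prob(i, n):
    """P(xi_{n-i+1} ... xi_{n+1} = 1 0^i), the event in (23)."""
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    prob = Fraction(1, n - i + 1)                # a one at position n-i+1
    for j in range(n - i + 2, n + 2):            # zeros at n-i+2, ..., n+1
        prob *= Fraction(j - 1, j)
    return prob


def expected_late_spacings(i, n, k_max):
    """Partial sum of (22) over n + 1 < k <= k_max."""
    return sum(spacing_prob(i, k) for k in range(n + 2, k_max + 1))
```

The partial sum in (22) telescopes to $1/(n+1) - 1/k_{\max}$, which makes the limit $1/(n+1)$ visible without any floating point error.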

The length $A(n)$ of the first cycle in canonical cycle notation is precisely the value $i$ for which this "extra" $i$-cycle may occur,

$$A(n) = n + 1 - \max\{j \le n : \xi_j = 1\},$$

so that with

$$J(n) := \max\{j \ge 0 : \xi_{j+1} = 1,\ j \le n-1\} \tag{24}$$

we have $J(n) + A(n) = n$.

We now summarize the Feller coupling in a way which matches the coupling for primes in subsection 1.2. Start with independent Poisson random variables $Z_i$ with $\mathbb{E}Z_i = 1/i$. Take the infinite multiset having $Z_i$ copies of $i$. Take a size biased permutation $B_1, B_2, \ldots$ of this multiset. Let $J(n) \in [0, n-1]$ be the largest partial sum not exceeding $n-1$. (There is no need to calculate the distribution of $J(n)$, but it happens to be exactly uniform on $0, 1, \ldots, n-1$, which matches the harmonic distribution in (7), in the sense that $\log H(n)$ is approximately uniform on $[0, \log n]$.) Fill in one extra cycle length, $A(n) := n - J(n)$; this corresponds to $P_0(n)$. The resulting cycle structure, with cycles of lengths $B_1, B_2, \ldots, B_L$ and $A(n)$ (where $J(n) = B_1 + \cdots + B_L$), is the cycle structure of a random permutation chosen uniformly from $S_n$.
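The Feller coupling is easy to simulate. The sketch below is an illustration under the assumptions of this section, not code from the lectures: it draws $\xi_2, \ldots, \xi_n$, appends the artificial one at position $n+1$, and reads the cycle counts $C_i(n)$ off as $i$-spacings. The function name and the choice to return $A(n)$ alongside the counts are hypothetical.

```python
import random
from collections import Counter


def feller_cycle_counts(n, rng):
    """Return (counts, a_n): counts[i] = C_i(n), a_n = A(n), from 1 xi_2 ... xi_n 1."""
    # Position 1 always holds a one (xi_1 = 1); position n+1 holds the artificial one.
    xi = [1] + [1 if rng.random() < 1.0 / j else 0 for j in range(2, n + 1)] + [1]
    ones = [pos for pos, bit in enumerate(xi, start=1) if bit]
    spacings = [b - a for a, b in zip(ones, ones[1:])]
    # The last spacing ends at the artificial one, so it equals
    # A(n) = n + 1 - max{j <= n : xi_j = 1}.
    return Counter(spacings), spacings[-1]
```

Since the spacings run from position 1 to position $n+1$, the cycle lengths always sum to $n$, and $J(n) = n - A(n)$ can be tabulated over many runs to see the uniform distribution on $0, 1, \ldots, n-1$.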

Write $e_i := (0, 0, \ldots, 0, 1, 0, \ldots)$ for the unit vector with all zero coordinates except for a one in position $i$. The Feller coupling shows that the Wasserstein insertion-deletion distance for permutations compared to their independent limit process is at most 2, for all $n$, with a monotonicity relation as a bonus:

$$\mathbb{E}\sum_{1 \le i \le n} |C_i(n) - Z_i| \le \frac{2n}{n+1} < 2, \quad\text{and} \tag{25}$$

$$\text{always}\quad (C_1(n), C_2(n), \ldots) \le (Z_1, Z_2, \ldots) + e_{A(n)}. \tag{26}$$

The result (26) for random permutations is stronger than the analogous result for prime factorizations, (9), in that there is no exceptional event on which monotonicity may fail. The result for permutations suggests the following conjecture from [6], for which, in the style of Erdős, I now offer a five hundred dollar prize.

Conjecture 1. ($500 prize offered) For all n > 1, it is possible to con- 
struct N(n) uniformly distributed from 1 to n, M(n) defined by and a 
prime P(n) such that 

$$\text{always}\quad N(n) \mid M(n)P(n). \tag{27}$$

My reason for believing the conjecture to be true is that permutations "fit 
together perfectly," as witnessed by the Feller coupling, and that primes do 
so also. One sense in which "primes fit together perfectly" is that the weights 
log 2, log 3, log 5, log 7, log 11, . . . are such that 1) all multisets of primes have 
distinct weights, and 2) the weights of these multisets are evenly spaced: 
log 1, log 2, log 3, log 4, .... 

A restatement of Conjecture 1 in the language of stochastic monotonicity, with respect to the partial order of divisors and multiples, is that for every $n$, for some randomized choice of $P_0(n)$ to be 1 or a prime factor of $N(n)$, $N(n)/P_0(n)$ lies below $M(n)$ in distribution. A more specific version of the conjecture, from [2], is that $P_0(n)$ can be chosen as the first prime factor of $N(n)$ under a size biased permutation.

In terms of the usual combinatorial language of matchings, Conjecture 1 may be stated as follows. For any set $D$ of positive integers, define

$$\mathcal{I}(D) := \{i : \exists\, m \in D,\ p \text{ prime},\ i \mid mp\},$$

so that for example $\mathcal{I}(\{1\})$ is the set of primes, together with 1. Write $[n]$ for $\{1, 2, \ldots, n\}$. The conjecture is that $\forall n \ge 1$, $\forall D \subset [n]$,

$$\frac{1}{n}\,\big|\mathcal{I}(D) \cap [n]\big| \ \ge\ \sum_{m \in D} \mathbb{P}(M(n) = m) = \prod_{p \le n}\Big(1 - \frac{1}{p}\Big) \sum_{m \in D} \frac{1}{m}.$$

2.3. At least $2 - \epsilon$ indels are needed. The following extensions to (25) aren't obvious, but help give a full picture of the qualitative behavior of the Feller coupling. Namely, the positive and negative parts of the quantity inside the absolute value in the left side of (25) have

$$\sum_{i=1}^{n} (C_i(n) - Z_i)^+ \to 1 \quad\text{in probability and expectation, and}$$

$$\text{for } k = 0, 1, 2, \ldots,\quad \mathbb{P}\Big(\sum_{i=1}^{n} (Z_i - C_i(n))^+ = k\Big) \to p_k > 0, \quad\text{with } \sum_k k\,p_k = 1.$$

In words, the coupling converts $(C_1(n), \ldots, C_n(n))$ to $(Z_1, \ldots, Z_n)$ with, in the limit, one deletion and a random, mean 1 number of insertions, using $k$ insertions with probability $p_k$.

A separate argument (see [2]) shows that, for any coupling, Feller or otherwise, with probability approaching one, at least one deletion is necessary, i.e. $1 = \lim \mathbb{P}\big(\sum_i (C_i(n) - Z_i)^+ \ge 1\big)$. In particular, $1 \le \liminf \mathbb{E}\sum_{1}^{n} (C_i(n) - Z_i)^+$. Then, since $\mathbb{E}C_i(n) = 1/i = \mathbb{E}Z_i$ for $i = 1$ to $n$, the number of insertions must, on average, equal the number of deletions. Hence $1 \le \liminf \mathbb{E}\sum_{1}^{n} (Z_i - C_i(n))^+$, so that $2 \le \liminf \mathbb{E}\sum_{1}^{n} |C_i(n) - Z_i|$. This shows that the Feller coupling, from the point of view of (25), is asymptotically optimal.

3. Lecture 3. Rescaling space - to get a scale invariant Poisson process

3.1. Review: couplings for primes and permutations. To review, we have two couplings for growing a system with parameter $n$. The coupling for primes grows a random integer $J(n)P_0(n)$ which is almost uniformly distributed from 1 to $n$. The coupling for permutations grows a random permutation which is distributed exactly uniformly in $S_n$.

The coupling for primes takes a list $Q_1, Q_2, \ldots$ of primes, forms their partial products, and pays attention to $J(n) = Q_1 Q_2 \cdots Q_{L(n)}$, the largest partial product not exceeding $n$. On a logarithmic scale, this means that we plot the points $0, \log Q_1, \log Q_1 + \log Q_2, \ldots$, and consider the largest point not exceeding $\log n$. Think of the values $\log Q_i$ as spacings. We finish by filling in one extra spacing, $\log P_0(n)$, to get to the point $\log(J(n)P_0(n))$ close to, but not to the right of, $\log n$. The resulting integer is $J(n)P_0(n) = Q_1 Q_2 \cdots Q_L P_0$, with either $L$ or $L+1$ prime factors, depending on whether $P_0 = 1$ or not.

The coupling for permutations takes a list $B_1, B_2, \ldots$ of positive integers, forms their partial sums, and pays attention to $J(n)$, the largest partial sum not exceeding $n - 1$. We then fill in one extra spacing, of size $A(n)$, to get to the point $n = J(n) + A(n) = B_1 + B_2 + \cdots + B_L + A(n)$. The resulting cycle structure has $L(n) + 1$ cycles.

We review once again, looking only at the sequence of points plotted, and their spacings. For primes, the points plotted are $0, \log Q_1, \log Q_1 + \log Q_2, \ldots$, with spacings $\log Q_1, \log Q_2, \log Q_3, \ldots$. Conditional on the multiset of spacings, the order in which they are taken is given by a size biased permutation. For each prime $p$, the number of $k$ such that $\log Q_k = \log p$ is $Z_p$, and the $Z_p$ are independent, geometrically distributed. For permutations, the points plotted are $0, B_1, B_1 + B_2, \ldots$, with spacings $B_1, B_2, B_3, \ldots$. Conditional on the multiset of spacings, the order in which they are taken is given by a size biased permutation. For each positive integer $i$, the number of $k$ such that $B_k = i$ is $Z_i$, and the $Z_i$ are independent, Poisson$(1/i)$.

The underlying reason that primes and permutations have similar behavior is that for both systems, the spacings have the same logarithmic property: the expected total number of spacings of size at most $x$ grows like $\log x$. For primes, this is the property that

$$\mathbb{E}\sum_{\log p \le x} Z_p = \sum_{p \le e^x} \frac{1}{p-1} \sim \log x$$

and for permutations,

$$\mathbb{E}\sum_{i \le x} Z_i = \sum_{i \le x} \frac{1}{i} \sim \log x.$$

We pursue this further; even the expression $\gamma + \log x + o(1)$ is common to the two systems; see (41).



3.2. Limits after scaling space. What does the sequence $\xi_1\xi_2\cdots$ look like when viewed from a distance? Encode $\xi_1\xi_2\cdots \in \{0,1\}^\infty$ as the point process, i.e. random measure,

$$\sum_{i \ge 1} \xi_i\,\delta(i),$$

where $\delta(i)$ is the measure placing unit mass at $i$. Our example was

$$\xi_1\xi_2\cdots = 10111000011000100000\cdots \ \longleftrightarrow\ \delta(1)+\delta(3)+\delta(4)+\delta(5)+\delta(10)+\delta(11)+\delta(15)+\cdots.$$

[Figure: the points 1, 3, 4, 5, 10, 11, 15 marked on a line.]

We rescale space: divide the locations by $x$, to get $\sum_i \xi_i\,\delta(i/x)$, and take $x \to \infty$. The limit is the "scale invariant" Poisson process $X$ on $(0,\infty)$ with intensity $\frac{1}{x}\,dx$. The intensity gives the expected number of points in any interval $(a,b)$ with $0 < a < b < \infty$, which is

$$\mathbb{E}X(a,b) = \int_a^b \frac{1}{x}\,dx = \log(b/a).$$

The Poisson property is that for any disjoint $I_1, I_2, \ldots \subset (0,\infty)$, the counts of points in these sets, $X(I_1), X(I_2), \ldots$, are independent, and Poisson distributed. In particular, for $0 < a < b < \infty$,

$$\mathbb{P}(X(a,b) = 0) = \exp(-\mathbb{E}X(a,b)) = \frac{a}{b}.$$

Likewise, consider the point process $\sum_i Z_i\,\delta(i)$. This process encodes the multiset of spacings used in the Feller coupling, without keeping track of the order in which the spacings are used. In our example, $\xi_1\xi_2\cdots = 10111000011000100000\cdots$, which implies $Z_1 \ge 3$, $Z_2 \ge 1$, $Z_4 \ge 1$, $Z_5 \ge 1$. We now reveal more about the outcome in this example, declaring that $Z_1 = 3$, $Z_2 = 1$, $Z_3 = 0$, $Z_4 = 1$, $Z_5 = 1$, which corresponds to the example for primes, (10). And to have a nice picture, we also declare that $Z_i = 0$ for $i = 6$ to 13, $Z_{14} = 2$, $Z_{15} = Z_{16} = 0$, $Z_{17} = 1$, and $Z_i = 0$ for $i = 18$ to 25. Our random measure is

$$\sum_{i \ge 1} Z_i\,\delta(i) = 3\delta(1) + \delta(2) + \delta(4) + \delta(5) + 2\delta(14) + \delta(17) + \cdots.$$

[Figure: the points 1, 2, 4, 5, 14, 17 marked on a line.]
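The lower bounds on the $Z_i$ quoted for the example string can be recovered mechanically by counting spacings between consecutive ones. This is a small illustrative sketch; the helper name `spacing_counts` is not from the lectures, and only the first 20 bits of the example are used, which is why the counts are lower bounds.

```python
from collections import Counter


def spacing_counts(bits):
    """Count i-spacings (gaps between consecutive ones) in a 0/1 string."""
    ones = [pos for pos, ch in enumerate(bits, start=1) if ch == "1"]
    return Counter(b - a for a, b in zip(ones, ones[1:]))


# The ones sit at positions 1, 3, 4, 5, 10, 11, 15, so the gaps are 2, 1, 1, 5, 1, 4.
counts = spacing_counts("10111000011000100000")
```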



This process rescaled, $\sum Z_i\,\delta(i/x)$, converges in distribution to $X$:

$$\sum_{i \ge 1} Z_i\,\delta(i/x) \Rightarrow X \quad\text{as } x \to \infty. \tag{28}$$

Recall that $Z_i$ is the number of $i$-spacings of $\xi_1\xi_2\cdots$. We have a process $\xi$ whose rescaled limit is $X$, and the spacings of this process $\xi$ also have rescaled limit $X$. Hence it is plausible to guess that $X$ is equal in distribution to its own process of spacings.

3.3. The scale invariant spacing lemma. Indeed, the process $X$ is equal in distribution to its own spacings. Since $X$ has no multiple points (unlike $\sum Z_i\,\delta(i)$, which can have multiple points), $X$ can be encoded as a random set $X = \{X_i : i \in \mathbb{Z}\} \subset (0,\infty)$, with the points indexed so that for all $i$, $X_i < X_{i+1}$. With such indexing, the spacings are the points

$$Y_i := X_{i+1} - X_i$$

and the statement that the spacings of $X$ have the same distribution as $X$ is

$$\{Y_i : i \in \mathbb{Z}\} \stackrel{d}{=} X, \quad\text{and}\quad 1 = \mathbb{P}(Y_i \ne Y_j\ \ \forall\, i \ne j \in \mathbb{Z}).$$

The proof of this, from [2], is inspired by Janson's proof in section 2.1 that if the multiset with $Z_i$ copies of $i$, to be used as spacings, is placed in a size biased permutation, then the set of points is encoded by the Bernoulli$(1/i)$ sequence $1\xi_2\xi_3\cdots$. Viewing this in terms of the limits under spatial rescaling, it suggests that if the points of $X$, to be used as spacings, are placed in a size biased permutation, then the set of partial sums also is distributed as the scale invariant Poisson process.

The size biased permutation of $X$ has a very pleasing, symmetric expression. Recall from section 1.3 that a size biased permutation can be generated by using, for an object of weight $x$, an exponentially distributed label $W$, with density $f_W(w) = x e^{-wx}$. To emphasize that the weights are also random, and we have conditioned on the values of these weights, we write this with the notation for a conditional density given $x$, that is $f_{W|X}(w|x)\,dw = x e^{-wx}\,dw$. Starting with the scale invariant Poisson process $X = \{X_i\}$, with intensity $f_X(x)\,dx = (1/x)\,dx$ on $(0,\infty)$, and attaching label $W_i$ to $X_i$, the process $\{(W_i, X_i) : i \in \mathbb{Z}\}$ of points with labels is a Poisson process with intensity

$$f_X(x)\,dx\ f_{W|X}(w|x)\,dw = f_{W,X}(w,x)\,dw\,dx = e^{-wx}\,dw\,dx, \quad (w,x) \in (0,\infty)^2.$$

Note the beautiful and perfect symmetry between the points and labels: the distribution of the process is invariant under $(w,x) \mapsto (x,w)$. No such symmetry is possible for our other size biased permutations, since the points have discrete support, such as $\mathbb{N}$ or $\{\log p : p \text{ is prime}\}$, while the labels are continuously distributed in $(0,\infty)$. The distribution of the process is also invariant under rescaling of the form $(w,x) \mapsto (cw, x/c)$; we apply this with $c = 2$ in the picture below, and with $c = \log n$ in the proof of Lemma 2.

The picture shows the Poisson process with intensity $e^{-wx}$ restricted to the region $(0,b]^2$ with $b = 5$. The number of points in this region is Poisson with mean $\int_0^b \frac{1 - e^{-bx}}{x}\,dx$; this mean is 3.796 for $b = 5$, and would be 8.401 for $b = 50$.

[Figure: three simulated realizations of the Poisson process with intensity $e^{-wx}$ on $(0,5]^2$, plotted with solid circles, open circles, and "x" marks.]

We show three realizations; the first experiment, shown with solid circles, had 4 points in this region, the second experiment, shown with open circles, had 10 points, and the third experiment, shown with "x", had 3 points. This is honest simulation; I did only three runs, even though they hardly look typical to my naive eye. Try to visualize each of the three runs by itself.
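The two means quoted for the region $(0,b]^2$ can be reproduced numerically. The sketch below is only a check, not part of the original argument: it integrates the intensity $e^{-wx}$ over $w \in (0,b]$ in closed form, substitutes $u = bx$, and applies a plain midpoint rule. The function name and the step count are arbitrary choices.

```python
import math


def mean_points(b, steps=200_000):
    """Midpoint-rule value of the integral of (1 - exp(-b x)) / x over 0 < x <= b."""
    upper = b * b                       # after substituting u = b x
    h = upper / steps
    total = 0.0
    for m in range(steps):
        u = (m + 0.5) * h
        total += -math.expm1(-u) / u    # (1 - e^{-u}) / u, stable near u = 0
    return total * h
```

For large $b$ the value is close to $\gamma + 2\log b$, which is where the slow growth from 3.796 at $b = 5$ to 8.401 at $b = 50$ comes from.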

The proof of the scale invariant spacing lemma goes as follows. Start with the point process on $(0,\infty)^2$ with intensity $e^{-wy}\,dw\,dy$, and index the points as $(W_i, Y_i)$ with $W_i > W_{i+1}$ for all $i \in \mathbb{Z}$. (The labels $W_i$ are all distinct, with probability one, and we remove from the probability space the complementary event.) Notice the reverse direction for the deterministic inequality; small indices $i$ correspond to large labels $W_i$, which tend to go with small weights $Y_i$. This gives us a set of points $\{Y_i\}$ having the distribution of the scale invariant Poisson process, indexed in order of a size biased permutation, tending from small to large. Define the points $X_j$ to be the partial sums of the $Y_i$ in this order:

$$\text{for } j \in \mathbb{Z}, \quad X_j := \sum_{-\infty < i < j} Y_i.$$

This gives a set of points $0 < \cdots < X_i < X_{i+1} < \cdots < \infty$, and further calculation shows that the distribution of $\{X_i : i \in \mathbb{Z}\}$ is that of a scale invariant Poisson process with intensity $(1/x)\,dx$. By construction, the spacings of the $X_i$ are the points $Y_i$, with the desired distribution.

3.4. Primes and the scale invariant Poisson. How does the process of points in the coupling for primes appear, viewed from a distance? Recall, the points are $0, \log Q_1, \log Q_1 + \log Q_2, \ldots$, and their spacings, $\log Q_1, \log Q_2, \ldots$, are such that $\log p$ occurs $Z_p$ times, where the $Z_p$ are independent and geometrically distributed. A process which encodes the multiset of spacings, in a form suitable for spatial rescaling, is the random measure

$$\sum_{i \ge 1} \delta(\log Q_i) = \sum_p Z_p\,\delta(\log p). \tag{29}$$

The direct analog of (28) would be the statement that

$$\sum_p Z_p\,\delta(\log p / x) \Rightarrow X \quad\text{as } x \to \infty. \tag{30}$$

Standard probability theory reduces this to showing that for $0 < a < b < \infty$, the expected mass that the rescaled measure gives to $(a,b)$ converges to $\log(b/a)$ as $x \to \infty$. The expected mass is $\sum_{ax < \log p < bx} \mathbb{E}Z_p = \sum_{e^{ax} < p < e^{bx}} 1/(p-1)$, so that sufficient number-theoretic knowledge needed to prove (30) is that as $y \to \infty$, $\sum_{p \le y} 1/p = B + \log\log y + o(1)$, for some constant $B$, a statement weaker than the prime number theorem.

But the random measure $\sum_p Z_p\,\delta(\log p) = \sum_{i \ge 1} \delta(\log Q_i)$ is much closer to $X$ than the rescaling relation (30) shows. Namely, we can match up points so that the total amount of displacement needed to convert one sequence to the other is finite (in expectation, and therefore almost surely). This coupling comes from [2], where it is combined with the total variation distance approximation of the small prime factors of a uniform integer, to give a metrized version of the Poisson process approximation for the "intermediate" prime divisors, from De Koninck and Galambos [25]. The points $X_i$ of $X$ are indexed by $i \in \mathbb{Z}$, while the points $Q_i$ are indexed by $i \in \mathbb{N}$, so we define $Q_i = 1$ for $i \le 0$. The claim is that we can construct the $Q_i$ and $X_i$ on a single probability space, so that

$$\mathbb{E}\sum_{i \in \mathbb{Z}} |X_i - \log Q_i| < \infty. \tag{31}$$

3.4.1. Ignoring the difference between geometric and Poisson. In (31) the chief obstacle is that $X$ is intrinsically a Poisson process, while the prescribed multiplicity of $p$ in the sequence $Q_i$ is not Poisson, but rather a geometric, $Z_p$. We make our task easier if we change the prescription to the following: for every $p$, $A_p = \sum_i 1(Q_i = p)$, where the $A_p$ are independent, Poisson$(1/p)$. The two versions of the prescribed counts, $Z_p$ and $A_p$, are close, and can be coupled with $\mathbb{E}|Z_p - A_p| = 1/(p(p-1))$. To convert the coupling with Poisson multiplicities into a coupling with geometric multiplicities, we can move $|Z_p - A_p|$ copies of $\log p$, each at most through distance $\log p$, because there is an infinite supply of points $\log Q_i = 0$ to swap with. Since

$$\sum_p \frac{\log p}{p(p-1)} < \infty, \tag{32}$$

this perturbation is absorbed by the right side of (31).

A simple way to handle the scale invariant Poisson process on $(0,\infty)$ is to start with the translation invariant Poisson process $\mathcal{L}$ with intensity $1\,dx$ on $\mathbb{R}$, a process for which the number of points in an interval of length $x$ is Poisson distributed, with mean $x$. If the points of this process are $L_i$ for $i \in \mathbb{Z}$, then setting

$$X_i := e^{L_i} \tag{33}$$

gives the points of the scale invariant Poisson process $X$. To get $A_p$ to be Poisson with mean $1/p$, all we need to do is assign some interval of length $1/p$, and let $A_p$ count how many points of $\mathcal{L}$ land in that interval. Using disjoint intervals for different $p$ makes the $A_p$ mutually independent.
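The construction (33) and the interval-counting device can be tried out in a simulation. The sketch below is an illustration only: it simulates the rate-1 Poisson process on a finite window $[\text{lo}, \text{hi}]$ (an assumption made for computability, since the process on all of $\mathbb{R}$ has infinitely many points), using exponential gaps. The function names and the window are hypothetical choices.

```python
import math
import random


def poisson_points(lo, hi, rng):
    """Points of a rate-1 Poisson process on [lo, hi], built from exponential gaps."""
    points = []
    t = lo + rng.expovariate(1.0)
    while t <= hi:
        points.append(t)
        t += rng.expovariate(1.0)
    return points


def bin_count(points, left, length):
    """Number of points landing in the interval (left, left + length]."""
    return sum(1 for x in points if left < x <= left + length)


def scale_invariant_points(points):
    """Apply (33): the images exp(L_i) have intensity (1/x) dx on (e^lo, e^hi]."""
    return [math.exp(l) for l in points]
```

Averaging `bin_count` over many independent runs for an interval of length $1/p$ should give a value near $1/p$, which is the Poisson mean required of $A_p$.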

The error estimate for the prime number theorem (see e.g. [42], or [44], section 4.1) implies the well known estimate

$$\sum_{p \le x} \frac{1}{p} = B + \log\log x + O\big(\exp(-c\sqrt{\log x}\,)\big) \tag{34}$$

as $x \to \infty$, for some $c > 0$, with constant

$$B := \gamma - \sum_p \sum_{k \ge 2} \frac{1}{k p^k} = .261497,$$

where $\gamma$ is Euler's constant, $\gamma = .5772$. We use this in the form

$$f(x) := -B + \sum_{p \le e^x} \frac{1}{p} = \log x + O\big(\exp(-c\sqrt{x}\,)\big). \tag{35}$$

The best upper bound for (34) is $O\big(\exp(-c(\log x)^{3/5}(\log\log x)^{-1/5})\big)$, but the more easily stated estimate leads to the same order error bounds in our work.

We define a function $g : \mathbb{R} \to \{0, \log 2, \log 3, \ldots\}$ to be essentially the inverse of $f$. Specifically, $g$ has $(-\infty, -B] \mapsto 0$, $(-B, -B+1/2] \mapsto \log 2$, $(-B+1/2, -B+1/2+1/3] \mapsto \log 3$, $\ldots$. Since $f$ is close to the log function, $g$ is close to the exponential function. Now let $h : (0,\infty) \to \{0, \log 2, \log 3, \ldots\}$ be defined by $h(x) := g(\log x)$, so that $h$ is close to the identity function on $(0,\infty)$, and can be applied directly to the points $X_i$ of the scale invariant Poisson process on $(0,\infty)$. We have, under $h$, $(0, e^{-B}] \mapsto 0$, $(e^{-B}, e^{-B+1/2}] \mapsto \log 2$, $(e^{-B+1/2}, e^{-B+1/2+1/3}] \mapsto \log 3$, $\ldots$. This map $h$ is such that the number of $X_i$ with $h(X_i) = \log p$ is Poisson$(1/p)$, independently for all primes $p$, and $h$ is so close to the identity function that

$$\mathbb{E}\sum_{i \in \mathbb{Z}} |X_i - h(X_i)| < \infty.$$

The illustration below shows the "bins" for the translation invariant Poisson process on $\mathbb{R}$, with all arrivals to the left of $-B$ representing ones, arrivals between $-B$ and $-B + 1/2$ representing twos, and so on. The length of the bin corresponding to $p$ is $1/p$. We have faked arrivals to correspond to the example (10) from Lecture 1, with $Z_2 = 3$, $Z_3 = 1$, $Z_5 = 0$, $Z_7 = 1$, $Z_{11} = 1$. The reader should imagine an arrow labeled "exponential map," pointing to a second picture, labeled "Bins for Poisson ($1/x\ dx$ on $(0,\infty)$)", with dividing markers at $e^{-B}$, $e^{-B+1/2}$, $e^{-B+1/2+1/3}$, and so on.

[Figure: Bins for Poisson ($1\ dx$ on $(-\infty,\infty)$). The line is divided at $-B$, $-B+\frac12$, $-B+\frac12+\frac13$, $\ldots$; successive bins are labeled 1, 2, 3, 5, 7, 11, $\ldots$.]



Theorem 1. Let $X$ be the scale invariant Poisson process on $(0,\infty)$, with intensity $(1/x)\,dx$, and points $X_i$, $i \in \mathbb{Z}$. Define $Q_i$ by $\log Q_i = h(X_i)$. Then every $Q_i$ is either one or a prime, and for every prime $p$, $A_p := \sum_i 1(Q_i = p)$ is Poisson distributed, with $\mathbb{E}A_p = 1/p$. The $A_p$ are mutually independent, and with $f$ defined by (35),

$$\mathbb{E}\sum_i |X_i - \log Q_i| = \int_0^\infty |f(x) - \log x|\,dx < \infty. \tag{36}$$

Proof. Constructing $X$ from $\mathcal{L}$ as in the discussion above, so that $X_i = \exp(L_i)$, we have $\log Q_i = h(X_i) = g(L_i)$. This construction makes it obvious that the $A_p$ are independent, Poisson, with $\mathbb{E}A_p = 1/p$. To prove (36), we argue that

$$\mathbb{E}\sum_i |X_i - \log Q_i| = \mathbb{E}\sum_i |e^{L_i} - g(L_i)| = \int_{-\infty}^{\infty} |g(l) - e^l|\,dl =: a_0,$$

where we have used Fubini to express an expectation of a sum over points of $\mathcal{L}$ in terms of the intensity of $\mathcal{L}$. Now

$$\int_{-\infty}^{\infty} |g(l) - e^l|\,dl = \int_0^\infty |f(x) - \log x|\,dx < \infty,$$

where the equality holds because $a_0$ may be interpreted as the area between the graphs of $g$ and the exponential function, and equals the area between the graphs of $f$ and the logarithm (with vertical segments added to span the jumps of $f$ and of $g$). The convergence of this last integral at infinity follows from the bound (35), and the convergence of the integral at 0 follows simply as $\int_0^{\log 2} |f(x) - \log x|\,dx = \int_0^{\log 2} (-B - \log x)\,dx < \infty$.

3.4.2. Exploiting the relation between geometric and Poisson. The argument at (32), for changing the coupling with $A_p \sim$ Poisson$(1/p)$ copies of $p$ to one with $Z_p \sim$ Geometric, is not pretty, but more importantly, it is a non-explicit procedure and makes it hard to control the size biased permutation of the $\log Q_i$ and their partial products. Carrying out the following explicit coupling turned out to be the key to being able to analyze the coupling for primes described in Lecture 1. A simple observation, that the geometric distribution is compound Poisson, makes everything work! Every baby should know the following standard lemma, which we will apply with $a = 1/p$ and corresponding random variables $Z_p$ and $A_p^{(k)} = A_{p^k}$. The added entropy involved in the integer partition occurring in (39) is discussed later, in Section 3.5.



Lemma 1. Let $Z$ be geometric with parameter $a \in (0,1)$, i.e. $\mathbb{P}(Z \ge k) = a^k$. Let $A = A^{(1)}$ be Poisson with $\mathbb{E}A = a$, and more generally, for $k = 1, 2, 3, \ldots$, let $A^{(k)}$ be Poisson with $\mathbb{E}A^{(k)} = a^k/k$, with the $A^{(k)}$ independent. Then

$$Z \stackrel{d}{=} \sum_{k \ge 1} k A^{(k)} = A + \sum_{k \ge 2} k A^{(k)}. \tag{39}$$

Proof. Recall that the probability generating functions of geometric and Poisson are given by

$$\mathbb{E}s^Z = \sum_{j \ge 0} s^j (1-a) a^j = \frac{1-a}{1-as}, \qquad \mathbb{E}s^A = \sum_{j \ge 0} s^j e^{-a} a^j / j! = e^{a(s-1)},$$

with $|s| \le 1$, so that $\mathbb{E}s^{kA} = e^{a(s^k - 1)}$ and $\mathbb{E}s^{kA^{(k)}} = e^{((as)^k - a^k)/k}$. Writing

$$\log\frac{1-a}{1-as} = \sum_{k \ge 1} \frac{(as)^k - a^k}{k}$$

proves (39).
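Lemma 1 can also be checked numerically by building the distribution of $\sum_k k A^{(k)}$ directly. The following is a minimal sketch under stated assumptions: it convolves the laws of $kA^{(k)}$ for $k \le n_{\max}$, then multiplies by the probability that every $A^{(k)}$ with $k > n_{\max}$ vanishes, with that infinite sum truncated at a hypothetical cutoff `tail_terms` (adequate when $a$ is not close to 1). The function name is illustrative.

```python
import math


def compound_poisson_pmf(a, n_max, tail_terms=2000):
    """P(sum_k k A^(k) = j) for j = 0..n_max, with A^(k) ~ Poisson(a**k / k) independent."""
    pmf = [1.0] + [0.0] * n_max                  # start from a point mass at 0
    for k in range(1, n_max + 1):
        lam = a ** k / k
        comp = [0.0] * (n_max + 1)               # law of k * A^(k), restricted to 0..n_max
        for m in range(n_max // k + 1):
            comp[k * m] = math.exp(-lam) * lam ** m / math.factorial(m)
        pmf = [sum(pmf[i] * comp[j - i] for i in range(j + 1)) for j in range(n_max + 1)]
    # For the total to be at most n_max, every A^(k) with k > n_max must be zero.
    tail = sum(a ** k / k for k in range(n_max + 1, n_max + 1 + tail_terms))
    return [p * math.exp(-tail) for p in pmf]
```

Comparing the output with $(1-a)a^j$ for $j \le n_{\max}$ shows agreement to floating point accuracy, in line with (39).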

Notice also that the last expression in (39) demonstrates stochastic domination $Z_p \ge_d A_p$, so that in the expressions $|Z_p - A_p|$ in the argument before (32), the absolute value signs weren't needed!

For us,

$$Z_p = \sum_{k \ge 1} k A_{p^k}$$

with $p$ prime, $k \ge 1$, and

$$q = p^k, \qquad A_q \sim \text{Poisson}, \qquad \mathbb{E}A_q = \frac{1}{kq}.$$

The coupling for primes described in Lecture 1 had a multiset with $Z_p$ copies of $p$, taken in order of a size biased permutation. A far more tractable coupling has a multiset of primes and prime powers, with $A_q$ copies of the prime power $q = p^k$, and the objects $Q_i$ (re-using the same notation $Q_i$ as before) to be taken in size biased permutation are these prime powers. In the remainder of this Section 3.4.2 we will study the coupling of this multiset of prime powers to the scale invariant Poisson. Then in Section 3.5 we will take a size biased permutation, to give our second coupling for growing a random integer. We will analyze this second coupling in detail; the first coupling for growing a random integer, from Lecture 1, can be analyzed as a perturbation of the second coupling, but we won't present the details
of the comparison, which are similar in spirit to proofs of lemmas [5] and El 

The modified version of (34) is

$$\sum_{q = p^k \le x} \frac{1}{kq} = \gamma + \log\log x + O\big(\exp(-c\sqrt{\log x}\,)\big) \tag{40}$$

and our modified version of (35) is

$$f(x) := -\gamma + \sum_{q = p^k \le e^x} \frac{1}{kq} = \log x + O\big(\exp(-c\sqrt{x}\,)\big). \tag{41}$$

Likewise, we modify $g$ to be essentially the inverse of this $f$, with $g(t) = 0$ if $t \in (-\infty, -\gamma]$, and if $q > 1$ is a prime power, $l = \log q$ and $t \in (f(l-), f(l)]$, then $g(t) = l$. We define $h = g \circ \log$, so that our modified version of the map $h$, which can be applied to the points of the scale invariant Poisson process, has

$$(0, e^{-\gamma}] \mapsto 0, \quad (e^{-\gamma}, e^{-\gamma+1/2}] \mapsto \log 2, \quad (e^{-\gamma+1/2}, e^{-\gamma+1/2+1/3}] \mapsto \log 3,$$

$$\big(\exp(-\gamma + 1/2 + 1/3),\ \exp(-\gamma + 1/2 + 1/3 + 1/8)\big] \mapsto \log 4, \tag{42}$$

and so on. Notice that in the last endpoint given above we have $1/8 = 1/(kq)$ for $q = p^k = 4$. To summarize formally, $h$ is given by the recipe: $h(x) = 0$ if $0 < x \le e^{-\gamma}$, and if $q > 1$ is a prime power, $l = \log q$ and $\log x \in (f(l-), f(l)]$, then $h(x) = l$, with $f$ as in (41).
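The recipe for the modified $h$ translates directly into a table lookup. The sketch below is illustrative and works under an explicit assumption: prime powers are tabulated only up to a finite `limit`, so $h$ is defined only for arguments inside the tabulated bins. The names `prime_powers_with_weights` and `make_h` are hypothetical. The same table gives a numerical look at (40).

```python
import bisect
import math

EULER_GAMMA = 0.5772156649015329


def prime_powers_with_weights(limit):
    """Sorted list of (q, 1/(k q)) over prime powers q = p**k <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    out = []
    for p in range(2, limit + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):   # cross out multiples of the prime p
                sieve[m] = 0
            q, k = p, 1
            while q <= limit:
                out.append((q, 1.0 / (k * q)))
                q, k = q * p, k + 1
    return sorted(out)


def make_h(limit):
    """Return (h, table, edges); edges[j] is f(log q_j), the right end of bin j on the log scale."""
    table = prime_powers_with_weights(limit)
    edges, total = [], -EULER_GAMMA
    for _, weight in table:
        total += weight
        edges.append(total)

    def h(x):
        t = math.log(x)
        if t <= -EULER_GAMMA:
            return 0.0
        j = bisect.bisect_left(edges, t)           # first bin whose right end is >= t
        if j == len(edges):
            raise ValueError("x lies beyond the tabulated bins")
        return math.log(table[j][0])

    return h, table, edges
```

With `limit = 10**5`, the sum of the weights $1/(kq)$ differs from $\gamma + \log\log(10^5)$ only in the fourth decimal place or beyond, which is the closeness of $f$ to the logarithm that makes $h$ close to the identity.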



Theorem 2. Let $X$ be the scale invariant Poisson process on $(0,\infty)$, with intensity $(1/x)\,dx$, and points $X_i$, $i \in \mathbb{Z}$. Define points $Q_i$ by $\log Q_i = h(X_i)$, for the function $h$ described at (42). Then every $Q_i$ is either one or a prime power, and for every $q = p^k$, with $p$ prime and $k \ge 1$, $A_q := \sum_i 1(Q_i = q)$ is Poisson distributed, with $\mathbb{E}A_q = 1/(kq)$. The $A_q$ are mutually independent, and with $f$ defined by (41),

$$\mathbb{E}\sum_i |X_i - \log Q_i| = \int_0^\infty |f(x) - \log x|\,dx < \infty. \tag{43}$$

Proof. Constructing $X$ from $\mathcal{L}$ as in the discussion above, so that $X_i = \exp(L_i)$, we have $\log Q_i = h(X_i) = g(L_i)$. This construction makes it obvious that the $A_q$ are independent, Poisson, with $\mathbb{E}A_{p^k} = 1/(kq)$. To prove (43), we argue that

$$\mathbb{E}\sum_i |X_i - \log Q_i| = \mathbb{E}\sum_i |e^{L_i} - g(L_i)| = \int_{-\infty}^{\infty} |g(l) - e^l|\,dl =: b_0,$$

where we have used Fubini to express an expectation of a sum over points of $\mathcal{L}$ in terms of the intensity of $\mathcal{L}$. Now

$$\int_{-\infty}^{\infty} |g(l) - e^l|\,dl = \int_0^\infty |f(x) - \log x|\,dx < \infty,$$

where the equality holds because $b_0$ may be interpreted as the area between the graphs of $g$ and the exponential function, and equals the area between the graphs of $f$ and the logarithm. The convergence of this last integral at infinity follows from the bound (41), and the convergence of the integral at 0 follows simply as $\int_0^{\log 2} |f(x) - \log x|\,dx = \int_0^{\log 2} |{-\gamma} - \log x|\,dx < \infty$.

3.5. The size biased permutation of the multiset having $\log p^k$ with multiplicity $A_{p^k} \sim$ Poisson$(1/(kp^k))$. Just as the multiset with $Z_p$ copies of each prime $p$, for independent $Z_p \sim$ geometric$(1/p)$, is the "natural random infinite multiset of primes," the multiset with $A_q$ copies of each prime power $q = p^k > 1$, for independent $A_q \sim$ Poisson$(1/(kq))$, is the natural random infinite multiset of prime powers. The latter multiset may be viewed as the former, with auxiliary randomization, picking a partition of the integer $Z_p$, independently for each $p$.

[Incidentally, the amount of additional information in our partitioning of the $Z_p$ is small; it is approximately .612433379 bits, computed as follows. Recall that for a discrete random variable $X$ with $\mathbb{P}(X = x_i) = p_i > 0$, $\sum p_i = 1$, the entropy is $h(X) := -\sum p_i \log p_i$. For the geometrically distributed $Z$ in (39), $h(Z) = -\log(1-a) - \frac{a}{1-a}\log a = \sum_{k \ge 1} (a^k/k)(1 + \log(1/a^k))$, and for $A \sim$ Poisson$(x)$ we have $h(A) = x + x\log(1/x) + \sum_{j \ge 2} \mathbb{P}(A \ge j)\log j$, which we apply with $x = a^k/k$ for $k = 1, 2, 3, \ldots$. Recall that the entropy of an independent process, such as $(A^{(1)}, A^{(2)}, \ldots)$, is the sum of the entropies of the coordinates. Thus in (39), the additional information needed to partition $Z$ into $\sum_{k \ge 1} kA^{(k)}$ is

$$d(a) := \sum_{k \ge 1} h(A^{(k)}) - h(Z) = \sum_{k \ge 1}\Big(\frac{a^k}{k} + \frac{a^k}{k}\log\frac{k}{a^k} + \sum_{j \ge 2} \mathbb{P}(A^{(k)} \ge j)\log j\Big) - \sum_{k \ge 1} \frac{a^k}{k}\Big(1 + \log\frac{1}{a^k}\Big)$$

$= a^2 \log 2 + O(a^3 \log a)$ as $a \to 0+$. Numerical evaluation, with logs taken base 2, gives $d(1/2) = .375076$ as the additional information needed to partition $Z_2$, approximately .13879 for $Z_3$, and approximately .612433379 for the sum over all primes.]
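The entropy figures quoted above can be reproduced from the two entropy formulas in the text. The sketch below is a numerical check only, with hypothetical truncation parameters `j_max` and `k_max` (assumed large enough for $a \le 1/2$); the function names are illustrative.

```python
import math


def entropy_geometric(a):
    """h(Z) in nats for P(Z >= k) = a**k."""
    return -math.log(1 - a) - a / (1 - a) * math.log(a)


def entropy_poisson(x, j_max=80):
    """h(A) in nats for A ~ Poisson(x): x + x log(1/x) + sum_{j>=2} P(A >= j) log j."""
    tail_term, pmf, cdf = 0.0, math.exp(-x), 0.0
    for j in range(j_max):
        # Here pmf = P(A = j) and cdf = P(A <= j - 1), so 1 - cdf = P(A >= j).
        if j >= 2:
            tail_term += max(0.0, 1.0 - cdf) * math.log(j)
        cdf += pmf
        pmf *= x / (j + 1)
    return x + x * math.log(1 / x) + tail_term


def extra_information_bits(a, k_max=200):
    """d(a) in bits: the sum over k of h(A^(k)) minus h(Z), with A^(k) ~ Poisson(a**k / k)."""
    nats = sum(entropy_poisson(a ** k / k) for k in range(1, k_max + 1)) - entropy_geometric(a)
    return nats / math.log(2)
```

Evaluating `extra_information_bits(1/p)` at $p = 2$ and $p = 3$ gives the contributions of $Z_2$ and $Z_3$; summing over more primes approaches the total quoted in the text.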

In Lecture 1 we described a procedure for "growing" a random integer, based on a size biased permutation of the multiset having each prime $p$, taken as an object of weight $\log p$, with multiplicity $Z_p \sim$ Geometric$(1/p)$. While relatively simple to describe, this coupling is rather hard to analyze directly. The reason that it is hard to analyze directly is that the size biased permutation involves using exponentially distributed labels $W_i$, and the resulting two-dimensional process, with points of the form $(W_i, \log Q_i)$, does not have a simple structure. By changing the objects to be prime powers $q = p^k$, with weight $\log q$ and multiplicity $A_q \sim$ Poisson$(1/(kq))$, the total number of factors of $p$ is still distributed as $Z_p \sim$ Geometric$(1/p)$, because $\sum_{k \ge 1} k A_{p^k} = Z_p$, but now the size biased permutation is tractable.

The size biased permutation is tractable because attaching conditionally independent labels $W_i$ to the Poisson process with points $\log Q_i$ yields a two-dimensional Poisson process with points $(W_i, \log Q_i)$; this is an instance of the "labeling" theorem; see for example [31].

This process $\{(W_i, \log Q_i) : i \in \mathbb{N}\}$ is very close to the Poisson process of Section 3.3 on $(0,\infty)^2$ with intensity $e^{-wy}\,dw\,dy$, which was used to give a size biased permutation of the scale invariant Poisson process. However, our two-dimensional process now is neither discrete nor continuous. It is supported on a one dimensional subset of the positive quadrant, formed by the half lines $w > 0$, $y = \log q$ for some prime power $q = p^k > 1$. Its intensity on the line $y = \log q$ is the product of the discrete intensity, $f_Y(y) = 1/(kq)$ for the weight $y$, times the continuous density for exponentially distributed label $w$, $f_{W|Y}(w|y)\,dw = y e^{-wy}\,dw$. That is, for $y = \log q$, $q = p^k$,

$$f_{W,Y}(w,y)\,dw = f_Y(y)\,f_{W|Y}(w|y)\,dw = \frac{\log p}{p^k}\,e^{-wy}\,dw. \tag{47}$$

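The next lemma gives an exact expression for the density of $J(n)$, and will let us see in Lemma 3 that the distribution of $J(n)$ is close to the harmonic distribution (7) on $1, 2, \ldots, n$: in (50), the weights $1/(kq)$, summed over $q = n^\beta$, behave like $d(\log\log n^\beta) = d\beta/\beta$, and the zeta function has a simple pole at one, so the expression in (50) is close to

$$\frac{1}{i}\int_{\beta > 1-\alpha} \frac{d\beta}{\beta} \int_{c > 0} \beta e^{-(\alpha+\beta)c}\,\frac{c}{\log n}\,dc = \frac{1}{i \log n}.$$

Lemma 2. For the $Q_i$ taken in order of decreasing labels $W_i$, let $J(n)$ be the largest partial product with $J(n) \le n$. Then for $i = n^\alpha = 1$ to $n$, with $q = p^k = n^\beta$ ranging over prime powers,

$$\mathbb{P}(J(n) = i) = \sum_{q:\ \beta > 1-\alpha} \frac{1}{kq} \int_{c > 0} \beta e^{-\beta c}\,\frac{i^{-1-c/\log n}}{\zeta(1 + c/\log n)}\,dc \tag{49}$$

$$= \frac{1}{i} \sum_{q:\ \beta > 1-\alpha} \frac{1}{kq} \int_0^\infty \frac{\beta e^{-(\alpha+\beta)c}}{\zeta(1 + c/\log n)}\,dc. \tag{50}$$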

Proof. Write the labels $W_i$ as $W_i = S_i/\log Q_i$ as in Section 1.3, with $S_1, S_2, \ldots$ being iid standard exponentials, independent of $Q_1, Q_2, \ldots$. For any prime power $q > 1$ and constant $t > 0$, the number $A_q(t) := \sum_i 1(Q_i = q,\ S_i/\log q > t)$ of occurrences of $q$ with label greater than $t$ is Poisson distributed with $\mathbb{E}A_q(t) = \mathbb{E}A_q\ \mathbb{P}(S_i/\log q > t) = \frac{1}{kq}\,q^{-t}$. The $A_q(t)$ jointly for all $q$ are independent. For each prime $p$ let

$$Z_p(t) := \sum_{k \ge 1} k A_{p^k}(t). \tag{51}$$

Note that by (39) the distribution of $Z_p(t)$ is geometric with $\mathbb{P}(Z_p(t) \ge j) = (1/p^{1+t})^j$ and

$$\mathbb{P}(Z_p(t) = j) = (1 - p^{-1-t})\,(p^{-1-t})^j. \tag{52}$$

For any $t > 0$, consider the product $I_t$ of all primes and prime powers having label strictly greater than $t$, i.e.

$$I_t := \prod_{q = p^k} q^{A_q(t)} = \prod_p p^{Z_p(t)}.$$

Using (52) and the independence of the $Z_p(t)$ over all primes $p$, for any $i \ge 1$ we have

$$\mathbb{P}(I_t = i) = \prod_p (1 - p^{-1-t})\ i^{-1-t} = \frac{i^{-1-t}}{\zeta(1+t)}. \tag{53}$$
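The Euler product used in (53) is easy to check numerically for a particular $t$. The sketch below is an illustration only, under the assumption that truncating the product at primes up to a hypothetical `limit` is accurate enough; it compares $\prod_p (1 - p^{-s})$ at $s = 1 + t = 2$ with the classical value $1/\zeta(2) = 6/\pi^2$. The function names are illustrative.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes, returning all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Zero out p*p, p*p + p, ...; the replacement must have the slice's length.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


def inverse_zeta_euler(s, limit=10 ** 5):
    """Truncated Euler product: the product of (1 - p**(-s)) over primes p <= limit."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= 1.0 - p ** (-s)
    return prod
```

With $t = 1$ this gives $\mathbb{P}(I_1 = 1) \approx 0.6079$, the probability that no prime power at all has label greater than 1.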

With probability one, all labels $W_i$ are distinct. For any $t > 0$ there are,
with probability one, only finitely many labels greater than $t$; this follows
from (53), which summed over $i = 1, 2, \ldots$ yields 1. There are infinitely
many labels, with probability one, because the total intensity of our Poisson
process is infinite: it is $\sum_{q=p^k} \mathbb{E}A_q = \sum_{q=p^k} 1/(kq) > \sum_p (1/p) = \infty$.
For these three reasons combined, with probability one, as $t$ decreases from
infinity to zero, the partial products $I_t$ increase from 1 to infinity, and each
increase corresponds to factoring in one new factor $Q_i$, taken in decreasing
order of their labels $W_i = S_i/\log Q_i$.

Let $Q^*(n)$ be the new factor that first takes the partial product beyond
$n$, and let $T(n)$ be its label. Our product $J(n)$ will equal $I_t$ for $t = T(n)$.
We consider the joint distribution of $(Q^*(n), T(n), J(n))$. Write $i = n^{\alpha}$ for
the test value for $J(n)$, $q = p^k = n^{\beta}$ for the test value for $Q^*(n)$, and $c$ for
$T(n) \log n$, which, as $\log n$ times the label for $q$, is exponentially distributed
with rate $\log q/\log n = \beta$. Summing over $q$ and integrating over $c$ yields
(49) as the marginal distribution of $J(n)$, and simplifying yields (50).

Lemma 3. Let $H(n)$ have the harmonic distribution (7) on 1 to $n$, so that
for $i \le n$, $\mathbb{P}(H(n) = i) = 1/(i h_n)$. For the $J(n)$ in Lemma 2, based on
a size biased permutation of the natural infinite Poisson multiset of prime
powers,

(54) $d_{TV}(J(n), H(n)) := \frac{1}{2} \sum_{i \le n} |\mathbb{P}(J(n) = i) - \mathbb{P}(H(n) = i)| = O\!\left( \frac{1}{\log n} \right).$

Furthermore, the relative error in approximating the density of $J(n)$ by the
harmonic density is $O(1/\log n)$, uniformly:

(55) $\max_{1 \le i \le n} \left| \frac{\mathbb{P}(J(n) = i)}{\mathbb{P}(H(n) = i)} - 1 \right| = O\!\left( \frac{1}{\log n} \right).$



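Proof. Define $d_n(i) := (i \log n)\, \mathbb{P}(J(n) = i)$ so that (50) can be rewritten
as

(56) $d_n(i) = \sum_{q > n^{1-\alpha}} \frac{\log n}{kq} \int_{c>0} \frac{\beta e^{-\beta c}\, e^{-\alpha c}}{\zeta(1 + c/\log n)}\, dc.$

Since $\mathbb{P}(H(n) = i) = 1/(i h_n) = (1/(i \log n))\,(1 + O(1/\log n))$, showing (55) is
equivalent to showing $\max_{1 \le i \le n} |d_n(i) - 1| = O(1/\log n)$.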
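The step $1/(i h_n) = (1/(i \log n))(1 + O(1/\log n))$ rests on $h_n = \log n + \gamma + o(1)$. A small numerical sketch of that fact follows; the value $n = 10^6$ and the names `harmonic` and `rel_err` are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329

def harmonic(n):
    """h_n = 1 + 1/2 + ... + 1/n, summed accurately."""
    return math.fsum(1.0 / i for i in range(1, n + 1))

n = 10**6                                   # illustrative size
h_n = harmonic(n)
# Relative error of 1/(i*log n) as an approximation to 1/(i*h_n); it is free of i.
rel_err = abs(math.log(n) / h_n - 1.0)
```

Here `rel_err * math.log(n)` stays close to $\gamma$, which is consistent with a relative error of order $1/\log n$ and no better.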

In order to simplify (56), we apply the following with $\delta = c/\log n$. From
the well known $\zeta(1+\delta) = (1/\delta) + \gamma + O(\delta)$ as $\delta \to 0+$, we get
$1/\zeta(1+\delta) = \delta(1 - \gamma\delta + O(\delta^2)) = \delta - O(\delta^2)$ as $\delta \to 0+$. It follows that
$\exists C_1$, $|1/\zeta(1+\delta) - \delta| \le C_1 \delta^2$ for all $\delta > 0$. This motivates us to consider a first order approximation
to the right side of (56), defined by

$e_n(i) := \sum_{q > n^{1-\alpha}} \frac{1}{kq} \int_{c>0} \beta e^{-\beta c}\, c\, e^{-\alpha c}\, dc.$

Our goal is to show that uniformly in $1 \le i \le n$, $e_n(i) = 1 + O(1/\log n)$;
having this, the error estimate for $d_n(i)$ versus $e_n(i)$ will be virtually the
same computation.
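The bound $|1/\zeta(1+\delta) - \delta| \le C_1 \delta^2$ can be probed numerically. The sketch below approximates $\zeta$ by a truncated sum plus Euler-Maclaurin tail terms; this approximation, the cutoff `N`, and the trial constant $C_1 = 1$ are assumptions of the example, not claims from the text.

```python
import math

EULER_GAMMA = 0.5772156649015329

def zeta(s, N=1000):
    """Euler-Maclaurin approximation of the Riemann zeta function, real s > 1."""
    head = math.fsum(n ** -s for n in range(1, N))
    # Tail sum_{n >= N} n^{-s}: integral + half endpoint + first Bernoulli correction
    tail = N ** (1 - s) / (s - 1) + 0.5 * N ** -s + s * N ** (-s - 1) / 12
    return head + tail

def inv_zeta_gap(delta):
    """|1/zeta(1+delta) - delta|, the quantity bounded by C_1 * delta^2."""
    return abs(1.0 / zeta(1.0 + delta) - delta)

deltas = [0.01, 0.1, 0.5, 1.0, 2.0, 5.0, 50.0]
ratios = [inv_zeta_gap(d) / d**2 for d in deltas]   # candidate values for C_1
```

For small $\delta$ the ratio is close to $\gamma$, as the expansion $1/\zeta(1+\delta) = \delta - \gamma\delta^2 + O(\delta^3)$ predicts; on this grid every ratio stays below 1.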

Note that since $\int_{c>0} c\, e^{-(\alpha+\beta)c}\, dc = (\alpha+\beta)^{-2}$, and $\beta = \log q/\log n$, we
have

$e_n(i) = \frac{1}{\log n} \sum_{q > n^{1-\alpha}} \frac{\log q}{kq}\, (\alpha+\beta)^{-2}.$

Instead of (41) we only need a crude bound, due to Chebyshev, that

$R(x) := \sum_{q = p^k \le x} \frac{\log q}{kq} - \log x = O(1).$

Fix $n$ and $i = n^{\alpha}$, $1 \le i \le n$, and define

$S_t := \sum_{n^{1-\alpha} < q \le n^t} \frac{\log q}{kq} = (t - (1-\alpha)) \log n - R(n/i) + R(n^t),$

so that by Abel summation

$e_n(i) = \frac{1}{\log n} \int_{t \in (1-\alpha, \infty)} dS_t\, (\alpha+t)^{-2} = \frac{1}{\log n} \int_{(1-\alpha, \infty)} S_t\, 2(\alpha+t)^{-3}\, dt.$

The contribution at infinity to the Abel summation is zero since $S_t \sim t \log n$
as $t \to \infty$. From $\sup_{x \ge 1} |R(x)| < \infty$, and $\int_{1-\alpha}^{\infty} 2(t + \alpha - 1)(\alpha+t)^{-3}\, dt = 1$,
it follows that $\max_{1 \le i \le n} |e_n(i) - 1| = O(1/\log n)$.

Finally, using $|1/\zeta(1+\delta) - \delta| \le C_1 \delta^2$, we have

$|d_n(i) - e_n(i)| \le \frac{C_1}{\log n} \sum_{q > n^{1-\alpha}} \frac{1}{kq} \int_{c>0} \beta e^{-\beta c}\, c^2\, e^{-\alpha c}\, dc = \frac{C_1}{(\log n)^2} \int_{(1-\alpha, \infty)} dS_t\, 2(\alpha+t)^{-3} = O\!\left( \frac{1}{\log n} \right).$

3.6. Filling in the extra prime factor. As in Lecture 1, we take $P_0(n)$
to be one or prime (and not a prime power!), such that $J(n)P_0(n) \le n$,
picking uniformly over the $1 + \pi(n/J(n))$ possibilities.

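With the notation $f_n(i) := \mathbb{P}(J(n)P_0(n) = i)$, the total variation distance
in the next lemma is

(57) $d_{TV}(JP_0, N) = \sum_{1 \le i \le n} \left( f_n(i) - \frac{1}{n} \right)^{+} = \sum_{1 \le i \le n} \left( f_n(i) - \frac{1}{n} \right)^{-} = \frac{1}{2} \sum_{1 \le i \le n} \left| f_n(i) - \frac{1}{n} \right|.$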



Lemma 4. The total variation distance between the distribution of $J(n)P_0(n)$
and the uniform distribution satisfies

(58) $d_{TV}(N(n), J(n)P_0(n)) = O\!\left( \frac{\log\log n}{\log n} \right).$

Proof. Since $P_0$ is one or prime, for $1 \le m \le n$,

(59) $f_n(m) := \mathbb{P}(J(n)P_0(n) = m) = {\sum_{p|m}}^{*}\ \mathbb{P}\!\left( J(n) = \frac{m}{p} \right) \frac{1}{1 + \pi(np/m)},$

where $\sum^{*}$ indicates that the index $p$ ranges over prime divisors of $m$, also
allowing $p = 1$. Using $1/(i \log n)$ as an approximation for $\mathbb{P}(J(n) = i)$, we
consider the simpler expression

$g_n(m) := \frac{1}{n} {\sum_{p|m}}^{*}\ \frac{1}{\log n}\, \frac{np}{m}\, \frac{1}{1 + \pi(np/m)}.$



We have

(60) $\sup_{1 \le m \le n} |1 - f_n(m)/g_n(m)| = O(1/\log n)$

thanks to (55), and hence also $\sum_{m \le n} |f_n(m) - g_n(m)| = O(1/\log n)$. Thus
(58) is equivalent to $\sum_{m \le n} |g_n(m) - \frac{1}{n}| = O(\log\log n/\log n)$.

From the well known error bound for the prime number theorem, $\pi(x) = \mathrm{li}(x) + O(x e^{-c\sqrt{\log x}})$,
together with the approximation $\mathrm{li}(x) = (x/\log x)[1 + 1/\log x + O((\log x)^{-2})]$, we have

$r(x) := \frac{x}{1 + \pi(x)} - (\log x - 1) = O(1/\log x).$

Thus a good approximation to $g_n(m)$ will be given by

$h_n(m) := \frac{1}{n \log n} {\sum_{p|m}}^{*} \left( \log\frac{np}{m} - 1 \right).$

Writing $s(i)$ for the largest squarefree divisor of $i$, and $\omega(i)$ for the number
of distinct prime divisors of $i$, we have

(61) $h_n(m) = \frac{\log(s(m)) + (1 + \omega(m))\,(\log(n/m) - 1)}{n \log n}.$
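The remainder $r(x) = x/(1 + \pi(x)) - (\log x - 1)$ that drives the passage from $g_n$ to $h_n$ can be tabulated directly. This is only an illustrative sketch over integer $x \le 10^5$ (a range chosen for the example); the helper names `prime_counts` and `K_hat` are hypothetical.

```python
import math

def prime_counts(n):
    """pi[x] = number of primes <= x, for 0 <= x <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    pi, count = [0] * (n + 1), 0
    for x in range(n + 1):
        count += sieve[x]
        pi[x] = count
    return pi

def r(x, pi):
    """r(x) = x/(1 + pi(x)) - (log x - 1) for an integer x >= 1."""
    return x / (1 + pi[x]) - (math.log(x) - 1)

PI = prime_counts(10**5)
# Empirical stand-in for sup |r(x)| over the integers in the tabulated range
K_hat = max(abs(r(x, PI)) for x in range(1, 10**5 + 1))
```

On this range the largest value of $|r(x)|$ occurs at $x = 1$, and $|r(x)|$ is already small at $x = 10^5$, consistent with boundedness and with the $O(1/\log x)$ decay.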



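With $N = N(n)$ to represent the uniform distribution on 1 to $n$, the total
variation distance in (58) is approximately

(62) $\frac{1}{2} \sum_{m \le n} \left| h_n(m) - \frac{1}{n} \right| = \frac{1}{2}\, \mathbb{E} \left| \frac{\log(s(N)) - \log n}{\log n} + \frac{1 + \omega(N)}{\log n}\, (\log(n/N) - 1) \right|.$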



Before completing the proof of (58), we outline the analysis to focus on
the source of the $\log\log n$ factor in (58), and the reason that it cannot be
decreased. The net contribution from the first term inside the expectation
in (62) is $O(1/\log n)$, and for the second term, the two factors are approximately
uncorrelated, with $\mathbb{E}(1 + \omega(N))/\log n \sim \log\log n/\log n$ for the first
factor. The second factor has $\mathbb{E}|\log(n/N) - 1| \to \mathbb{E}|S - 1| = 2/e$, where
$S$ has the standard exponential distribution, with $\mathbb{P}(S > t) = e^{-t}$. Thus it
should be possible to show that

(63) $d_{TV}(N(n), J(n)P_0(n)) \sim \frac{1}{e}\, \frac{\log\log n}{\log n}.$
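The constant $2/e$ in this outline comes from $\mathbb{E}|\log(n/N) - 1|$ with $N$ uniform on 1 to $n$. A short sketch computes that expectation exactly for a finite $n$; the choice $n = 10^5$ and the function name `mean_abs_dev` are assumptions of the example.

```python
import math

def mean_abs_dev(n):
    """E|log(n/N) - 1| for N uniform on {1, ..., n}, by direct summation."""
    return math.fsum(abs(math.log(n / m) - 1.0) for m in range(1, n + 1)) / n

value = mean_abs_dev(10**5)
limit = 2 / math.e              # E|S - 1| for a standard exponential S
```

The agreement between `value` and `limit` illustrates why the factor $\frac{1}{2} \cdot \frac{2}{e} = \frac{1}{e}$ appears in (63).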

The estimates for the simpler task (58) in place of (63) are as follows.
We need only $K := \sup_{x \ge 1} |r(x)| < \infty$ to conclude that
$|g_n(m) - h_n(m)| \le \frac{1}{n \log n} \sum^{*}_{p|m} |r(np/m)| \le K(1 + \omega(m))/(n \log n)$, and hence
$\sum_{1 \le m \le n} |g_n(m) - h_n(m)| \le K\, \mathbb{E}(1 + \omega(N(n)))/\log n = O(\log\log n/\log n)$. In (62), for the
first term with $\mathbb{E}\log(s(N)) - \log n$ we have $\mathbb{E}\log N(n) - \log n \to -1$ by
Stirling's formula, and $\mathbb{E}\log(N(n)/s(N(n))) \le \sum_{p \le n} \log p\ \mathbb{E}(C_p(n) - 1)^{+}
\le \sum_p \log p \sum_{k \ge 2} p^{-k} < \infty$. For the second term, we apply Cauchy-Schwarz,
with $\mathbb{E}(1 + \omega(N(n)))^2 \sim (\log\log n)^2$ and $\mathbb{E}(\log(n/N(n)) - 1)^2 \to 1$. These
bounds combine to show that $\sum_{m \le n} |h_n(m) - \frac{1}{n}| = O(\log\log n/\log n)$. Together
with our previous bounds comparing $f$, $g$, and $h$, we have proved
(58).



3.7. Keeping score: 1 insertion and on average $1 + O((\log\log n)^2/\log n)$
deletions suffice for primes. We need to control the expected number of
deletions used to convert $M(n)$ to $N(n)$, which correspond to the number
of primes not exceeding $n$ but occurring in the size biased permutation after
the partial product $J(n)$. A priori it seems reasonable to believe that one
would have to calculate something along the lines of (22) for permutations,
crossed with (49) for primes. Happily, the idea of "matching intensity" can
be used to finesse the calculation.

From (58) in Lemma 4 it follows (see section 3.8 if you want to know the
details) that on a single probability space we can construct independent $Z_p$,
together with $J(n)$ and $P_0(n)$, and an exactly uniform $N(n)$, so that always
$J(n) \mid M(n)$, and the good event

(64) $E_n = \{J(n)P_0(n) = N(n)\}$

has

(65) $\mathbb{P}(E_n^c) = d_{TV}(J(n)P_0(n), N(n)) = O\!\left( \frac{\log\log n}{\log n} \right).$



On the uncoupled event, $E_n^c$, how many prime factors, that might contribute
to $d_W(n)$, can we expect to see? Recall our notation from section
1.5.1. Lemma 6 in [2] states that for events $E$ of small, but not too small
probability, the expected number of prime factors is $O((\log\log n)\,\mathbb{P}(E))$. The
precise statement is: uniformly in $\delta \in [0, 1]$,

(66) sup E (1(E) n(N(n))) = O (max ( d loglog n, - J J . 

E:F(E)<8 V V log n) ) 

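With $E = E_n^c$ and $\delta = \mathbb{P}(E_n^c)$, the combination of (65) with (66) shows that

(67) $\mathbb{E}\left( 1(E_n^c)\, \Omega\!\left( \frac{N}{(N,M)} \right) \right) \le \mathbb{E}\big( 1(E_n^c)\, \Omega(N) \big) = O\!\left( \frac{(\log\log n)^2}{\log n} \right).$

Now $J(n) \mid M(n)$ always, and on $E_n$ we have $N(n) = J(n)P_0(n)$, so that
$N/(N,M) \mid P_0$, so that

(68) $\mathbb{E}\left( 1(E_n)\, \Omega\!\left( \frac{N}{(N,M)} \right) \right) \le 1.$

Adding gives

(69) $\mathbb{E}\, \Omega\!\left( \frac{N(n)}{(N(n), M(n))} \right) \le 1 + O\!\left( \frac{(\log\log n)^2}{\log n} \right).$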

Now any coupling has

$\mathbb{E}\, \Omega\!\left( \frac{N(n)}{(N(n), M(n))} \right) - \mathbb{E}\, \Omega\!\left( \frac{M(n)}{(N(n), M(n))} \right) = O(1/\log n),$

because (see e.g. [44] p. 41)

(70) $\mathbb{E}\, \Omega(N(n)) - \mathbb{E}\, \Omega(M(n)) = O(1/\log n).$

Combining this with (69) yields

(71) $\mathbb{E}\, \Omega\!\left( \frac{M(n)}{(N(n), M(n))} \right) \le 1 + O\!\left( \frac{(\log\log n)^2}{\log n} \right).$

Adding (69) and (71) proves

Theorem 3. The coupling of section 3.5, based on a size biased permutation
of the natural Poisson multiset of prime powers, and extended to include a
uniform random integer $N(n)$, has

(72) $\mathbb{E} \sum_{p \le n} |C_p(n) - Z_p| \le 2 + O\!\left( \frac{(\log\log n)^2}{\log n} \right),$

and hence $d_W(n) \le 2 + O((\log\log n)^2/\log n)$.

A separate argument (see [2]) shows that, for any coupling, with probability
approaching one, at least one insertion is necessary to convert $M(n)$
to $N(n)$, i.e. $1 = \lim \mathbb{P}(\sum_{p \le n} (C_p(n) - Z_p)^{+} \ge 1)$. In particular,
$1 \le \liminf \mathbb{E} \sum_{p \le n} (C_p(n) - Z_p)^{+}$. Using (70), the average number of deletions is
within $O(1/\log n)$ of the number of insertions. Hence $1 \le \liminf \mathbb{E} \sum_{p \le n} (Z_p - C_p(n))^{+}$,
so that $2 \le \liminf \mathbb{E} \sum_{p \le n} |C_p(n) - Z_p|$. This shows that
$\lim d_W(n) = 2$, and that our coupling, from the point of view of the
insertion-deletion metric in section 1.5.1, is asymptotically optimal.
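The balance (70) between insertions and deletions can be illustrated numerically. The sketch assumes that $Z_p$ is geometric with $\mathbb{P}(Z_p \ge j) = p^{-j}$ (the $t = 0$ case of the law in (52)), so that $\mathbb{E}\,\Omega(M(n)) = \sum_{p \le n} 1/(p-1)$, and uses $\mathbb{E}\,\Omega(N(n)) = \frac{1}{n} \sum_{p^k \le n} \lfloor n/p^k \rfloor$. The function names and $n = 10^5$ are illustrative.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]

def mean_omega_uniform(n, primes):
    """E Omega(N(n)) for N(n) uniform on 1..n: (1/n) * sum over p^k <= n of floor(n/p^k)."""
    total = 0
    for p in primes:
        q = p
        while q <= n:
            total += n // q
            q *= p
    return total / n

def mean_omega_independent(primes):
    """E Omega(M(n)) = sum_{p <= n} E Z_p, assuming E Z_p = 1/(p - 1)."""
    return math.fsum(1.0 / (p - 1) for p in primes)

n = 10**5
primes = primes_up_to(n)
gap = mean_omega_independent(primes) - mean_omega_uniform(n, primes)
scaled_gap = gap * math.log(n)      # should stay bounded if the gap is O(1/log n)
```

Under these assumptions `gap` is positive and `scaled_gap` stays of order one, in line with (70).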

3.8. Extending the coupling to N(n), constructively. (The reader is
invited to skip past this section, which defends the claim at (64).) Our
coupling as described so far is fairly natural and explicit. It starts with
independent Poisson $A_q$ for $q = p^k$. These determine the prime powers $Q_i$
and the $Z_p$ such that $M(n) := \prod_{p \le n} p^{Z_p} = \prod_{i:\, Q_i = p^k,\ p \le n} Q_i$, where the $Z_p$
for primes $p$ are independent geometric. Use independent exponentially
distributed $S_1, S_2, \ldots$ to give a size biased permutation of these prime powers $q$;
this determines $J(n)$, a divisor of $M(n)$. A single uniformly distributed random
variable $U$, independent of the $Q_i$ and $S_i$, can then be used to determine
$P_0$, via the recipe: with $K(n) := 1 + \pi(n/J(n))$, let $P_0 = 1$ if $K(n)U < 1$,
and let

(73) $P_0 = p_i$ if $K(n)U \in [i, i+1),$

where $p_i$ denotes the $i$th smallest prime. Equation (58) gives an upper bound
on the total variation distance between the distributions of $J(n)P_0(n)$ and
of $N(n)$, and (57) emphasizes that this is just about the distribution of
$J(n)P_0(n)$.
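The recipe for $P_0$ can be written out as a few lines of code. This is a minimal sketch, assuming the reading of (73) in which $P_0 = p_i$ exactly when $K(n)U \in [i, i+1)$; the function name `pick_p0` and its arguments are hypothetical, with `x` standing for $n/J(n)$ and `primes` for a precomputed increasing list of primes.

```python
def pick_p0(x, u, primes):
    """Choose P_0 uniformly from {1} and the primes <= x, driven by u in [0, 1).

    Assumed rule: with K = 1 + pi(x), return 1 if K*u < 1, and the i-th
    smallest prime if K*u lies in [i, i+1).
    """
    small = [p for p in primes if p <= x]   # the pi(x) admissible primes
    K = 1 + len(small)
    i = int(K * u)                          # index of the unit interval containing K*u
    return 1 if i == 0 else small[i - 1]

example_primes = [2, 3, 5, 7, 11, 13]
# Sweep u over a fine grid to see that each admissible value is equally likely.
draws = [pick_p0(10, (j + 0.5) / 1000, example_primes) for j in range(1000)]
```

Because the rule is monotone in `u`, a large value of $U$ forces a large $P_0$, which is the feature used later in the proof of Theorem 4.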

We have constructed $J(n)P_0(n)$ so far, and we have not yet constructed
$N(n)$. It is a standard coupling argument that there exist couplings in
which the good event $E_n = \{J(n)P_0(n) = N(n)\}$ has the maximum possible
probability, with $\mathbb{P}(E_n^c) = d_{TV}(JP_0, N)$. It is also true, but less
obvious, that there exist such couplings which extend our already given
construction of $((Q_i, S_i)_{i \ge 1}, P_0)$. We note further that the joint distribution of
$((Q_i, S_i)_{i \ge 1}, P_0, N)$ for such an extension is not uniquely determined. The
next paragraph offers a constructive choice of joint distribution.

We present a recipe for constructing $N(n)$ as a function of the random
variables used above, together with some auxiliary randomization. Take two
additional uniform random variables $U_1, U_2$ with $U, U_1, U_2, Q_1, Q_2, \ldots, S_1, S_2, \ldots$
independent. We define a deterministic function $r_n(u, u_1, u_2, q_1, q_2, \ldots, s_1, s_2, \ldots)$
such that for all $\omega \in \Omega$, $N(n) := r_n(U, U_1, U_2, Q_1, Q_2, \ldots, S_1, S_2, \ldots)$ and
$\mathbb{P}(N(n) \ne J(n)P_0(n)) = d_{TV}(JP_0, N)$. The recipe $r_n$ is determined by two
requirements. First, let $b_n(i) := \min(f_n(i), \frac{1}{n})/f_n(i)$, and let $E_n$ be the event

(74) $E_n := \{U_1 \le b_n(J(n)P_0(n))\}.$

Note that $\mathbb{P}(E_n) = \sum_{1 \le i \le n} f_n(i)\, b_n(i) = \sum \min(f_n(i), \frac{1}{n}) = 1 - d_{TV}(JP_0, N)$.
On the event $E_n$, we define $N(n)$ by $N(n) = J(n)P_0(n)$. Second, let
$G_n(j) := \sum_{i \le j} (f_n(i) - \frac{1}{n})^{-} / d_{TV}(JP_0, N)$, so that by (57), $G_n(n) = 1$.
On the event $E_n^c$, we define $N(n)$ to have one of the values $i$ for which
$f_n(i) < 1/n$, by setting:

(75) on $E_n^c$, $N(n) = j$ if and only if $G_n(j-1) < U_2 \le G_n(j)$.

It follows that for $i = 1$ to $n$, $\mathbb{P}(N(n) = i) = 1/n$, and that $E_n$ satisfies (64)
and (65).
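The two requirements (74) and (75) can be sketched for an arbitrary probability mass function `f` on 1 to $n$, standing in for $f_n$. The toy distribution and the names `coupling_tables`, `couple` and `law_of_N` are assumptions of this example; the code follows the recipe as read above, with $b_n(i) = \min(f_n(i), 1/n)/f_n(i)$.

```python
def coupling_tables(f):
    """Return (b, G, d_tv) for a pmf f on {1, ..., n}, as in (74) and (75)."""
    n = len(f)
    b = [min(fi, 1 / n) / fi if fi > 0 else 1.0 for fi in f]
    deficit = [max(1 / n - fi, 0.0) for fi in f]       # (f_n(i) - 1/n)^-
    d_tv = sum(deficit)
    G, acc = [], 0.0
    for d in deficit:
        acc += d
        G.append(acc / d_tv if d_tv > 0 else 1.0)
    return b, G, d_tv

def couple(x, u1, u2, b, G):
    """Given x = J(n)P_0(n) and uniforms u1, u2, return the coupled N(n)."""
    if u1 <= b[x - 1]:                 # event E_n: keep N(n) = J(n)P_0(n)
        return x
    for j, g in enumerate(G, start=1):  # on E_n^c: N(n) = j iff G(j-1) < u2 <= G(j)
        if u2 <= g:
            return j
    return len(G)

def law_of_N(f):
    """Exact law of N(n), P(E_n^c) and d_TV implied by the recipe."""
    b, G, d_tv = coupling_tables(f)
    fail = sum(fi * (1 - bi) for fi, bi in zip(f, b))   # P(E_n^c)
    law, prev = [], 0.0
    for fj, bj, gj in zip(f, b, G):
        law.append(fj * bj + fail * (gj - prev))
        prev = gj
    return law, fail, d_tv

toy_f = [0.5, 0.3, 0.2]                # hypothetical stand-in for f_n, with n = 3
toy_law, toy_fail, toy_dtv = law_of_N(toy_f)
toy_b, toy_G, _ = coupling_tables(toy_f)
```

For the toy input the resulting law is uniform and the failure probability equals the total variation distance, which is the defining property of a maximal coupling.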

The above construction yields $N(n)$, uniformly distributed from 1 to $n$,
together with random integers $J(n)$ and $P_0(n)$ that evolve smoothly with $n$
growing, such that the event $E_n = \{N(n) = J(n)P_0(n)\}$ has the maximal
possible probability, namely $1 - d_{TV}(J(n)P_0(n), N(n))$. Is it the case that
with probability one, for all sufficiently large $n$, we have $N(n) = J(n)P_0(n)$?
This is not a trivial question, as the events $E_n$ are not nested, and the sum
of their probabilities is infinite.

Theorem 4. For the above construction,

$1 = \mathbb{P}(E_n \text{ eventually}).$

Proof. First note that as events, $\{J(n) \to \infty\} = \{\sum_p Z_p = \infty\}$, and hence
$1 = \mathbb{P}(J(n) \to \infty)$. Recall our use of two fixed uniform random variables, $U$
in (73) and $U_1$ in (74). For $\delta > 0$ we will show that

(76) $\{U_1 < 1 - \delta,\ U > \delta, \text{ and } J(n) \to \infty\} \subset \{E_n \text{ eventually}\}$

and hence $\mathbb{P}(E_n \text{ eventually}) \ge (1 - \delta)^2$.

Assume we are given an outcome in the event on the left side of (76).
Since the $J(n)$ are partial products, $J(n) \to \infty$ ensures that $\omega(J(n)) \to \infty$
as $n \to \infty$. To have $E_n$ fail, we must have $b_n(J(n)P_0(n)) < 1 - \delta$, and hence
$n f_n(J(n)P_0(n)) > 1/(1 - \delta) > 1 + \delta$. For $n$ and $\omega(m)$ both large, arguing as
in the proof of Lemma 4, $f_n(m)/h_n(m) \to 1$, so that $n f_n(m) > 1 + \delta$ implies
that $n h_n(m) > 1 + \delta/2$. [The hypothesis that $\omega(m)$ is large is needed
to ensure that terms of $g_n(m)$ having $x = np/m$ small, where we cannot
guarantee that $x/(1 + \pi(x))$ is close to $(\log x - 1)$, make a relatively negligible
contribution.] Using only the bound $\log s(m) \le \log n$ in (61), this implies
that for sufficiently large $n$ and $\omega(m)$, $(1 + \omega(m)) \log(n/m) > (\delta/2) \log n$.
Pick $x_0 > 1/(2\delta)$ and large enough that $x > x_0$ implies $\pi(2\delta x) > \delta\,(1 + \pi(x))$.
Since $\sup_{1 \le i \le n} \omega(i) = o(\log n)$, $(1 + \omega(m)) \log(n/m) > (\delta/2) \log n$
implies that for sufficiently large $n$, $n/m > x_0$. Thus for sufficiently large $n$,
if $E_n$ fails then $x = n/J(n) > x_0$. But $U > \delta$ now implies $P_0(n) > 2\delta n/J(n)$,
which would contradict $n/m > x_0$ with $m = J(n)P_0(n)$. This shows that
for the given outcome, there is an $N_0$ (depending on the outcome through
the values of $J(1), J(2), \ldots$ and $U, U_1$) such that for all $n > N_0$, $E_n$ occurs.

4. Lecture 4: The distance to the Poisson-Dirichlet 
For an integer $N(n)$ distributed uniformly from 1 to $n$, write

$N(n) = P_1(n) P_2(n) \cdots P_{K_n}(n) = P_1(n) P_2(n) \cdots, \qquad P_1 \ge P_2 \ge \cdots,$

where $K_n = \Omega(N(n))$ is the number of prime factors of $N$, and every $P_i(n)$
is either one or prime. Billingsley [17] in 1972 proved that the Poisson-Dirichlet
process gives the limit in distribution for the sizes of the large
prime factors,

(77) $\left( \frac{\log P_1(n)}{\log n}, \frac{\log P_2(n)}{\log n}, \ldots \right) \Rightarrow (V_1, V_2, \ldots),$

where $(V_1, V_2, \ldots)$ has the Poisson-Dirichlet distribution with parameter 1.
The marginal distribution of the largest component is given by Dickman's
[21] function $\rho$, in the form $\mathbb{P}(V_1 \le 1/u) = \rho(u)$; see [S] Chapter III.5. The
characterization of the limit which is useful for us is

(78) $(V_1, V_2, \ldots) = \mathrm{RANK}(1 - X_1, X_1 - X_2, X_2 - X_3, \ldots),$

where RANK is the function which sorts the coordinates in nonincreasing
order, and $X_1, X_2, \ldots$ are those points of the scale invariant Poisson process
which fall in $(0,1)$, indexed with $1 > X_1 > X_2 > \cdots > 0$. For other
characterizations of the Poisson-Dirichlet, see for example [39]. Donnelly
and Grimmett [22] gave a very nice proof of (77) by showing that a size
biased permutation of the left side of (77) converges in distribution to
$(1 - X_1, X_1 - X_2, X_2 - X_3, \ldots)$, and then using the continuity of RANK on the
simplex.

We asked: how close are the right and left sides of (77)? One notion of
approximation, from Knuth and Trabb Pardo [32], is that for fixed $i$ and
$t \in (0, 1)$, as $n \to \infty$,

$\mathbb{P}\!\left( \frac{\log P_i(n)}{\log n} \le t \right) \to \mathbb{P}(V_i \le t).$

They also give a version with an $O(1/\log n)$ correction term, of the form:
for fixed $i \ge 1$, and fixed $t \in (0,1)$, $\mathbb{P}(\log P_i(n)/\log n \le t) = \mathbb{P}(V_i \le t)
+ r_i(t)/\log n + O(1/(\log n)^2)$. That a similar result holds for the joint finite
dimensional distributions, together with an expansion in negative powers of
the logarithm, has recently been shown by Tenenbaum [46]. To state this,
for $k \ge 1$ write $F_n(\alpha_1, \ldots, \alpha_k) = \mathbb{P}(\log P_i(n)/\log n > \alpha_i \text{ for } i = 1 \text{ to } k)$,
and $\varphi_0(\alpha_1, \ldots, \alpha_k) = \mathbb{P}(V_1 > \alpha_1, \ldots, V_k > \alpha_k)$, so that Billingsley's result
(77) is equivalent to: for all $k \ge 1$ and $\alpha_1, \ldots, \alpha_k \in (0, 1)$, $F_n(\alpha_1, \ldots, \alpha_k) =
\varphi_0(\alpha_1, \ldots, \alpha_k) + o(1)$. Tenenbaum's result is the following: for every $k \ge 1$
there exist functions $\varphi_1, \varphi_2, \ldots$, continuous except on finitely many
hyperplanes, such that,

/7n x j? 1 ^ Mai,. ..,a k ) , „ / 1 \ 

(79) F " (ai ""' a * )= ^ + °l( ai log n )««j 

holds for all fixed $(\alpha_1, \ldots, \alpha_k)$, and uniformly outside an exceptional set of
$(\alpha_1, \ldots, \alpha_k)$ with measure $O(\log\log n/\log n)$, specified in [36].

A very natural and useful choice of metric is the $l_1$ distance, since this
controls the approximation of the set of logarithms of all divisors, see [27], [37],
by its limit, see [3], section 22. Approximations in this metric are not
comparable to results such as (79); an analogous situation is that, from
knowledge that the difference between the two sides of (3) is at most $1/n$,
Kubilius' fundamental lemma does not follow as a consequence.

For the metrized approximation question, it
is not necessary to divide by $\log n$. For a proof or disproof of the following
conjecture about the $l_1$ distance, from [6], I now offer a one hundred dollar
prize.

Conjecture 2. ($100 prize offered) For all $n \ge 1$, it is possible to
construct $N(n)$ uniformly distributed from 1 to $n$, and the Poisson-Dirichlet
process $(V_1, V_2, \ldots)$, on one probability space, so that

(80) $\mathbb{E} \sum |\log P_i(n) - (\log n) V_i| = O(1).$

Note that the liminf of the left side of (80) is at least 1, because $\mathbb{E} \sum \log P_i(n)
= \mathbb{E} \log N(n) = \log n - 1 + o(1)$, from Stirling's $n! \sim (n/e)^n \sqrt{2\pi n}$, while
$\mathbb{E} \sum (\log n) V_i = \log n$. We finished our workshop lectures by describing
a coupling that achieves $O(\log\log n)$ in place of $O(1)$ in (80); here in the
writeup we also present the proof that this coupling works as claimed.
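The gap of 1 between $\log n$ and $\mathbb{E} \log N(n)$ is easy to confirm numerically, since $\mathbb{E} \log N(n) = \frac{1}{n} \log n!$. The sketch below evaluates this through the log-gamma function; the name `stirling_gap` and the test size $n = 10^6$ are choices made for illustration.

```python
import math

def stirling_gap(n):
    """log n - E log N(n) for N(n) uniform on {1, ..., n}, using log n! = lgamma(n + 1)."""
    return math.log(n) - math.lgamma(n + 1) / n

gap_at_million = stirling_gap(10**6)
```

The value differs from 1 only by a term of order $(\log n)/n$, in agreement with Stirling's formula.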

Theorem 5. The coupling of the Poisson-Dirichlet with a random integer
$N(n)$ uniformly distributed from 1 to $n$, described in section 4.1, achieves

(81) $\mathbb{E} \sum |\log P_i(n) - (\log n) V_i| = O(\log\log n).$

Historical notes: all logarithmic combinatorial structures have a Poisson- 
Dirichlet limit for the fractions of system size in the largest, second largest, 
third largest, . . . , components. This was shown in 1977 for permutations, 
by Kingman [30], and independently by Vershik and Schmidt [47], [48]. It
was shown for random mappings — where the Poisson-Dirichlet limit has 
parameter 1/2 — by Aldous pQ in 1983. It was shown for a wide class of 
combinatorial structures by Hansen [28] in 1994, and with local limit bounds,
for a very general scheme, in [7].

The analog of Conjecture 2 is false for permutations, for the simple reason
that $nV_i$ has a distribution with a continuous density, while the size
$L_i(n)$ of the $i$th largest cycle has integer support, so that under any
conceivable coupling, for every $i$, $\liminf_n \mathbb{E}|L_i(n) - nV_i| \ge 1/4$. For $i >
(1 + \varepsilon) \log n$, one can match $nV_i$ with $L_i(n) = 0$, and the net result is
that $\liminf_n (1/\log n)\, \mathbb{E} \sum_i |L_i(n) - nV_i| \ge 1/4$. It is shown in [9] that the
coupling for permutations which is analogous to our coupling in section 4.1
achieves this lower bound, with

(82) $\mathbb{E} \sum_i |L_i(n) - nV_i| \sim \frac{1}{4} \log n.$



4.1. Growing a random integer from the Poisson-Dirichlet. This
section gives a third coupling for growing a random integer $J(n)P_0(n)$ which
is close in distribution to the uniform random integer $N(n)$. Our first coupling,
in Lecture 1, determined $J(n)$ from a size biased permutation of the
multiset with $Z_p$ spacings of size $\log p$, with $Z_p$ geometrically distributed.
Our second coupling, in Lecture 3, used instead a size biased permutation
of the Poisson multiset with $A_{p^k}$ spacings of size $\log p^k$, with $A_{p^k} \sim
\mathrm{Poisson}(1/(k p^k))$. This Poisson multiset can be constructed by applying the
deterministic function $h$ at (42) to the points of the scale invariant Poisson
process, so the second coupling may be viewed as constructing $J(n)$ from
a deterministic function of the scale invariant Poisson, together with the
auxiliary randomization of a size biased permutation.

For our third coupling, this Poisson multiset, and its permutation, will 
be given by a deterministic function applied to the spacings of the Poisson- 
Dirichlet process. These spacings are the points of the scale invariant Poisson 
process, in order of a size biased permutation. The sizes $\log p^k$ are slightly
different from the sizes of the Poisson-Dirichlet spacings, so the ordering 
of spacings in our third coupling is a perturbation of that in our second 
coupling. 

Start with the Poisson process on $(0, \infty)^2$, with intensity $e^{-wy}\, dw\, dy$,
from section 3.3, with points $\{(W_i, Y_i), i \in \mathbb{Z}\}$, but reverse the direction of
the indexing, so that $W_i < W_{i+1}$ for all $i \in \mathbb{Z}$. Defining $X_i := \sum_{j \ge i} Y_j$ gives
the scale invariant Poisson process $X = \{X_i : i \in \mathbb{Z}\}$, indexed in decreasing
order, with $X_{i+1} < X_i$. Now shift the indexing of $\{(W_i, Y_i), i \in \mathbb{Z}\}$, so that
$X_1$ appears as the first point to the left of $\log n$. To summarize, we have the
points of the scale invariant Poisson process $X$, indexed so that

(83) $0 < \cdots < X_3 < X_2 < X_1 \le \log n < X_0 < X_{-1} < \cdots.$



The Poisson-Dirichlet from (78), scaled up by $\log n$, is realized as

(84) $((\log n)V_1, (\log n)V_2, \ldots) = \mathrm{RANK}(\log n - X_1, X_1 - X_2, \ldots).$

The points $Y_i = X_i - X_{i+1}$ for $i \in \mathbb{Z}$ are the points of the scale invariant
Poisson process, and using $h$ from (42), we construct random $Q_i$ by

(85) $\log Q_i = h(Y_i)$, with $Y_i = X_i - X_{i+1}$, $i \in \mathbb{Z}$.
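The realization of the Poisson-Dirichlet as ranked spacings can be simulated in a few lines. The sketch works on the unit scale of (78) and relies on the standard fact, used here as an assumption, that the points of the Poisson process with intensity $dx/x$ on $(0,1)$, listed in decreasing order, are successive products of independent uniforms. The cutoff `eps`, the seed, and the function name are illustrative.

```python
import math
import random

def poisson_dirichlet_sample(rng, eps=1e-12):
    """Ranked spacings RANK(1 - X_1, X_1 - X_2, ...) of the scale invariant
    Poisson process on (0, 1), truncated once the remaining mass is <= eps."""
    spacings, x = [], 1.0
    while x > eps:
        nxt = x * rng.random()       # next point: X_{i+1} = X_i * Uniform(0, 1)
        spacings.append(x - nxt)
        x = nxt
    return sorted(spacings, reverse=True)

rng = random.Random(12345)
samples = [poisson_dirichlet_sample(rng) for _ in range(20000)]
# Empirical P(V_1 <= 1/2), to compare with Dickman's rho(2) = 1 - log 2
frac_small_max = sum(s[0] <= 0.5 for s in samples) / len(samples)
rho_2 = 1 - math.log(2)
```

Multiplying every coordinate by $\log n$ gives the scaled version in (84); the comparison with $\rho(2)$ is a rough Monte Carlo check, not a proof.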



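[Figure: the points $\cdots, X_6, X_5, X_4, X_3, X_2, X_1$ to the left of $\log n$, with the spacings $Y_1, \ldots, Y_5$ and the threshold $e^{-\gamma}$ marked. Caption: An outcome where $e^{-\gamma} > Y_i$ for $i = 2, 6, 7, \ldots$, so that
$Q_1, Q_3, Q_4, Q_5$ are prime powers, but $1 = Q_2 = Q_6 = Q_7 = \cdots$.
Here $\log Q_i = h(Y_i)$ is close to $Y_i$, and $\log P_0(n)$ is close to $\log n - X_1$.]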



[The following information is motivational, and not part of our proof.
Like the $Q_1, Q_2, \ldots$ in section 3.3, for every $q = p^k$ we have that $A_q :=
\sum_{i \in \mathbb{Z}} 1(Q_i = q)$ is Poisson$(1/(kq))$, with the $A_q$ mutually independent, but
there are several differences in the indexing scheme: here the $Q_i$ are indexed
from right to left, there may be instances of $Q_i = 1$ in between occurrences
of proper prime powers, and most significantly, the sizes used for the size
biased permutation are the $X_i - X_{i+1}$ and not the $\log Q_i$. We will need
to show that since $h$ is close to the identity function, the second and third
couplings are close, in that with high probability, they produce the same
$J(n)$.]

Define $J^*(n)$ by

(86) $J^*(n) = \prod_{i \ge 1} Q_i.$

It is conceivable that $J^*(n) > n$ if $X_1$ is close enough to $\log n$ and $h$ gets
applied in places with $h(y) > y$; in this case we will prescribe $P_0^*(n) = 1$.
When $J^*(n) \le n$, take $P_0^*(n)$ to be one or prime, such that $J^*(n)P_0^*(n) \le n$,
picking uniformly over the $1 + \pi(n/J^*(n))$ possibilities.

We have two tasks. The first, carried out in Lemma 5, is to show that the
prime factors $P_1^*(n), P_2^*(n), \ldots$ of this random integer $J^*(n)P_0^*(n)$, listed
in nonincreasing order, give a vector of logarithms close to the Poisson-Dirichlet.
The second task, carried out in Lemma 6, is to show, like Lemma
4, that the random integer we have constructed is close to uniform. From
these two lemmas, Theorem 5 follows easily. We will state the lemmas, then
give the proof of Theorem 5 to finish this section. The next section provides
the proofs of the two lemmas.

Lemma 5.

(87) $\mathbb{E} \sum_{i \ge 1} |\log P_i^*(n) - (\log n) V_i| = O(1).$

Lemma 6.

(88) $d_{TV}(J^*(n)P_0^*(n), N(n)) = O\!\left( \frac{\log\log n}{\log n} \right).$



Proof of Theorem 5. As in section 3.8, the coupling of $J^*(n)P_0^*(n)$ with the
Poisson-Dirichlet process can be extended to include $N(n)$, distributed
uniformly on 1 to $n$, in such a way that $N(n) = J^*(n)P_0^*(n)$ except on a "bad"
event of probability equal to the total variation distance $d_{TV}(J^*(n)P_0^*(n), N(n))$.
On the bad event we have no control over the $l_1$ distance, apart from the
trivial bound: it is at most $2 \log n$. Multiplying by an upper bound on the
probability of the bad event, from Lemma 6, gives us the main contribution
to the error, of size $O(\log\log n)$. On the complement of the bad event, the
contribution is $O(1)$ by Lemma 5. Adding these errors gives (81). ■

4.2. Proofs of the lemmas for Theorem 5.

Proof of Lemma 5. There are three contributions to the $l_1$ distance:
first $Y_i$ versus $\log Q_i$, second a contribution from $Q_i$ which are prime powers
but not prime, and third $\log P_0^*(n)$ versus $(\log n - X_1)$. Observe that
for any vectors $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$ in $[0, \infty)^{\mathbb{N}} \cap l_1$, the
$l_1$ distance is not increased if the coordinates of both vectors are sorted:
$\|\mathrm{RANK}(x) - \mathrm{RANK}(y)\|_1 \le \|x - y\|_1$.
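The contraction property of RANK is easy to test on finite vectors. This is a small randomized sketch; the vector length, the exponential entries and the helper names `rank` and `l1` are arbitrary choices for the example.

```python
import random

def rank(v):
    """Sort coordinates in nonincreasing order (a finite-length stand-in for RANK)."""
    return sorted(v, reverse=True)

def l1(x, y):
    """l_1 distance between two vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(x, y))

rng = random.Random(1)
worst = 0.0     # largest observed value of ||RANK x - RANK y|| - ||x - y||
for _ in range(1000):
    x = [rng.expovariate(1.0) for _ in range(8)]
    y = [rng.expovariate(1.0) for _ in range(8)]
    worst = max(worst, l1(rank(x), rank(y)) - l1(x, y))
```

In every trial the sorted vectors are at least as close as the unsorted ones, so `worst` never rises above rounding error.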

Consider the random variable

(89) $D := \sum_{i \in \mathbb{Z}} |h(Y_i) - Y_i|.$

It has

(90) $\mathbb{E} D = \mathbb{E} \sum_{i \in \mathbb{Z}} |h(Y_i) - Y_i| = b_0 < \infty,$

using (44) and the scale invariant spacing lemma.

The first contribution to our expected $l_1$ distance is handled by: for all $n$,

(91) $\mathbb{E} \sum_{i \ge 1} |\log Q_i - Y_i| = \mathbb{E} \sum_{i \ge 1} |h(Y_i) - Y_i| \le \mathbb{E} D = b_0 < \infty.$

To handle the second contribution, we "split up" any prime powers $p^k$ with
$k > 1$ which may occur among the $Q_i$, defining $Q_1^*, Q_2^*, \ldots$ to be one or prime,
so that $J^*(n) = \prod_{i \ge 1} Q_i = \prod_{i \ge 1} Q_i^*$. To do this, start with $Q_j^* = 1$ whenever
$Q_j = 1$; some of these will be changed. Always take $Q_i^* = p$ when $Q_i = p^k$.
For any $Q_i = p^k$ with $k > 1$, take $k - 1$ indices $j$ for which $Q_j^* = 1$ and change
these to $Q_j^* = p$. With $x = (\log Q_1, \log Q_2, \ldots)$ and $y = (\log Q_1^*, \log Q_2^*, \ldots)$
we have $\|x - y\|_1 = \sum_{p,k,i} 2(k-1)(\log p)\, 1(Q_i = p^k) \le \sum_{q = p^k,\, k > 1} 2(\log q) A_q$.
Note that $\mathbb{E} \sum_{q = p^k,\, k > 1} (\log q) A_q = \sum_{q = p^k,\, k > 1} (\log q)/(kq) < \infty$.
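The splitting step and the resulting $l_1$ cost $2(k-1)\log p$ per prime power can be illustrated on a short list. In this sketch each $Q_i$ is encoded as a pair `(p, k)` meaning $p^k$, with `(1, 0)` standing for $Q_i = 1$; that encoding and the function names are assumptions of the example, and a finite list needs enough entries equal to 1, which the infinite sequence in the text always supplies.

```python
import math

def split_prime_powers(Q):
    """Replace each prime power p^k by k copies of p, using slots where Q_i = 1.

    Q is a list of pairs (p, k) encoding Q_i = p**k, with (1, 0) for Q_i = 1.
    Raises IndexError if there are not enough slots equal to 1.
    """
    star = [p if k >= 1 else 1 for p, k in Q]
    free = [i for i, (_, k) in enumerate(Q) if k == 0]   # indices with Q_i = 1
    for p, k in Q:
        for _ in range(max(k - 1, 0)):
            star[free.pop(0)] = p          # move one factor p into a free slot
    return star

def l1_log_distance(Q, star):
    """sum_i |log Q_i - log Q_i^*|."""
    return sum(abs(k * math.log(p) - math.log(s)) for (p, k), s in zip(Q, star))

Q_example = [(2, 3), (1, 0), (1, 0), (3, 1)]     # Q = (8, 1, 1, 3)
star_example = split_prime_powers(Q_example)
cost = l1_log_distance(Q_example, star_example)
```

For $Q = (8, 1, 1, 3)$ the product is unchanged and the cost is $4 \log 2 = 2(k-1)\log p$ with $p = 2$, $k = 3$.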

To handle the third contribution: using $c = (\log n - \sum_{i \ge 1} h(Y_i))^{+}$, our recipe
for $P_0^*(n)$ is to choose uniformly over the $1 + \pi(e^c)$ numbers which are one,
or a prime at most $e^c$. Writing $\mathbb{P}_c$ and $\mathbb{E}_c$ for such a choice of $P_0$, we have
$\sup_{c \ge 0} \mathbb{E}_c(c - \log P_0) < \infty$, as a simple consequence of the prime number
theorem, that $\pi(e^x) \sim e^x/x$ as $x \to \infty$. Using $X_1 = \sum_{i \ge 1} Y_i$, we have
$|(\log n - X_1) - (\log n - \sum_{i \ge 1} h(Y_i))| = |\sum_{i \ge 1} (Y_i - h(Y_i))|$, with expectation
bounded by $b_0$, using (90). Combining yields $\sup_n \mathbb{E}|\log P_0^*(n) - (\log n - X_1)| < \infty$.
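The quantity $\mathbb{E}_c(c - \log P_0)$ can be computed exactly for moderate $c$, since it equals $c$ minus the average of $\log$ over 1 and the primes up to $e^c$. The grid of $c$ values, the sieve bound and the function names below are illustrative assumptions.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(2, n + 1) if sieve[i]]

def mean_shortfall(c, primes):
    """E_c(c - log P_0), with P_0 uniform over 1 and the primes <= e^c."""
    x = math.exp(c)
    logs = [math.log(p) for p in primes if p <= x]
    return c - math.fsum(logs) / (1 + len(logs))   # log 1 = 0 contributes to the count only

PRIMES = primes_up_to(170_000)                     # covers e^c for c <= 12
shortfalls = [mean_shortfall(k / 10, PRIMES) for k in range(121)]
```

On this grid the values stay bounded and drift toward 1 for large $c$, which matches the heuristic $\theta(x)/\pi(x) \approx \log x - 1$ from the prime number theorem.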

Combining these three contributions, and using the $l_1$ contraction property
of the function RANK, proves (87). ■

Proof of Lemma 6.

In contrast to (83) and (85), we now re-index the $\{Y_i : i \in \mathbb{Z}\}$ so that

$\cdots < Y_{-2} < Y_{-1} < Y_0 < e^{-\gamma} \le Y_1 < Y_2 < \cdots.$

The choice of location of $e^{-\gamma}$ has the effect that $Q_1, Q_2, \ldots$ are the primes
and prime powers, while $1 = Q_i$ for $i \le 0$. The indexing of the $Y_i$ in the
order of their own values has the effect that, with

$S_i := W_i Y_i,$

the $S_i$ for $i \in \mathbb{Z}$ are, conditional on the values of $Y_i, i \in \mathbb{Z}$, mutually
independent, standard exponentials; and this would not have been true under
the previous indexing, where $W_i < W_{i+1}$.

We construct the second coupling, of section 3.3, from this multiset $\{Q_1, Q_2, \ldots\}$.
For the size biased permutation of the $\log Q_i$, we use the exponentially
distributed labels

(92) $\widetilde{W}_i := S_i/\log Q_i = W_i\, \frac{Y_i}{h(Y_i)},$

so that $J(n)$ is the largest partial product of the $Q_i$ not exceeding $n$, with
the $Q_i$ taken in order of decreasing labels $\widetilde{W}_i$. In contrast, $J^*(n)$ is a partial
product of $Q_1, Q_2, \ldots$ taken in order of decreasing $W_i$. Because $h$ is close to
the identity, eventually the permutation induced by the $\widetilde{W}_i$ agrees with the
permutation induced by the $W_i$. We will show that

(93) $\mathbb{P}(J^*(n) \ne J(n)) = O\!\left( \frac{\log\log n}{\log n} \right),$

and thus $d_{TV}(J^*(n)P_0^*(n), J(n)P_0(n)) = O(\log\log n/\log n)$. Combined with
Lemma 4 this yields $d_{TV}(J^*(n)P_0^*(n), N(n)) = O(\log\log n/\log n)$. As a
remark, we believe that the quantity in (93) is actually $O(1/\log n)$, but
since this improved bound would not improve the overall result, we settle
for the looser bound.

First, we show that the effect in (86) of stopping at the largest $X_i \le \log n$,
rather than stopping with the largest partial product not exceeding $n$, is
negligible. Consider the "good" event

(94) $G = \{D > \log\log n \text{ or } X \cap (\log n - \log\log n, \log n + \log\log n) \ne \emptyset\},$

with $D$ given by (89). We will show that

(95) $\mathbb{P}(G) = O\!\left( \frac{\log\log n}{\log n} \right).$

To show this, first observe that $\mathbb{P}(X \cap (\log n - \log\log n, \log n + \log\log n) \ne
\emptyset) = O(\log\log n/\log n)$, since the intensity of $X$ is $(1/x)\, dx$. Second, observe
that $\mathbb{P}(D > \log\log n) = O(1/\log n)$, which follows from showing $\mathbb{E} e^{\beta D} < \infty$
with some $\beta \ge 1$. In fact, $\mathbb{E} e^{\beta D} < \infty$ for all $\beta$; with $g$ as defined following
(41), and using (33), we have $D = \sum_{i \in \mathbb{Z}} |g(L_i) - \exp(L_i)|$, so that

$\mathbb{E} e^{\beta D} = \exp\left( \int_{-\infty}^{\infty} \left( e^{\beta |g(l) - e^l|} - 1 \right) dl \right).$

The contribution to the integral from the neighborhood of $-\infty$ is finite, using
$g(l) = 0$ there, and the contribution to the integral from the neighborhood of $\infty$
is finite, using $g(l) - e^l = O(e^l \exp(-c e^{l/2}))$ as $l \to \infty$, which follows from
(41).

Next, we consider a bad event on which the two permutations, one induced
by the sizes $W_i$, and the other induced by the sizes $\widetilde{W}_i := W_i Y_i/h(Y_i)$ as
in (92), might disagree in a nontrivial way, i.e. giving the opposite ordering
to $Q_i, Q_j > 1$, out beyond the place where the partial sum of the $Y_i$ exceeds
$(\log n)/2$. Let

(96) $B = \{\exists\, i \ne j:\ Y_i, Y_j \ge e^{-\gamma},\ W_i < W_j,\ \widetilde{W}_i > \widetilde{W}_j,\ \sum_k Y_k 1(W_k > W_i) > (\log n)/2\}.$

For $n$ so large that $\log\log n < (\log n)/2$, we have

$\{J^*(n) \ne J(n)\} \subset G \cup B,$

so that it only remains to show $\mathbb{P}(B) = O(\log\log n/\log n)$.
Let

$T_w = \sum_i Y_i 1(W_i > w).$

The distribution of $T_w$ is exponential, with $\mathbb{E} T_w = 1/w$; see for example
[31], under the "Moran process", or [2], where this is used as an ingredient in
the proof of the scale invariant spacing lemma. We say that $((w, y), (w', y'))$
is a "potential witness" to the bad event $B$ if $y, y' \ge e^{-\gamma}$, $w' < w$, the Poisson
process $\{(W_i, Y_i)\}$ has points at $(w, y)$ and $(w', y')$, and no points $(W_k, Y_k)$
with $w' < W_k < w$, and $T_w + y + y' > (\log n)/2$, and

(97) $\log\left( \frac{w}{w'} \right) \le |\log(y/h(y))| + |\log(y'/h(y'))|.$

Let $N_B$ denote the number of potential witnesses, and observe that $B \subset
\{N_B > 0\}$. Thus, it only remains to calculate that $\mathbb{E} N_B = O(\log\log n/\log n)$;
and in fact we will show that it is $O(1/\log n)$.

Now $\mathbb{E} N_B$ is merely a four-fold integral, so the reader is invited to take
our claim at face value; but for those declining our invitation, here are
the details. Conditional on having points $(W_i, Y_i)$ and $(W_j, Y_j)$ with
$W_i = w$, $W_j = w'$, the joint distribution of $Y_i, Y_j, T_w$ is that of three
independent exponentials, with means $1/w$, $1/w'$, and $1/w$ respectively. Let
$c_0 := \exp(2 \sup\{|\log(y/h(y))| : y \ge e^{-\gamma}\})$, so that for a potential
witness, $w/w' \le c_0$, and $Y_j$ lies below an exponential with mean $c_0/w$. This
gives us, for the conditional probability, that $\mathbb{P}_{w,w'}(Y_i + Y_j + T_w > \log n/2)
\le 3\,\mathbb{P}((c_0/w) S_1 > \log n/6) = 3 \exp(-w \log n/(6 c_0))$.

We need some monotonicity for the next simplification; from (41) we have
that

$h(x) = x + O(x e^{-c\sqrt{x}})$ as $x \to \infty$,

so that for some constants $c_1, c_2 > 0$, for all $y \ge e^{-\gamma}$, $|\log(h(y)/y)| \le
c_1 e^{-c_2 \sqrt{y}}$. Thus we can relax the notion of "potential witness," replacing
(97) with the condition

(98) $\log\left( \frac{w}{w'} \right) \le r(y) + r(y')$, where $r(y) = c_1 e^{-c_2 \sqrt{y}}$.

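Write $N_R$ for the number of potential witnesses in this relaxed sense, so that
$N_B \le N_R$. Now the indicator of the inequality (98) is a decreasing function
of $(y, y')$, while the indicator $1(y + y' + t > \log n/2)$ is an increasing function,
so that we have negative correlations (with respect to $Y_i$, $Y_j$, and $T_w$, which
are conditionally independent given $w, w'$):

$\mathbb{P}_{w,w'}\left( Y_i + Y_j + T_w > \frac{\log n}{2},\ \log\left(\frac{w}{w'}\right) \le r(Y_i) + r(Y_j) \right)
\le \mathbb{P}_{w,w'}\left( Y_i + Y_j + T_w > \frac{\log n}{2} \right) \cdot \mathbb{P}_{w,w'}\left( \log\left(\frac{w}{w'}\right) \le r(Y_i) + r(Y_j) \right).$

We use the monotonicity of $r(\cdot)$ again, to justify

$\mathbb{P}_{w,w'}\left( \log\left(\frac{w}{w'}\right) \le r(Y_i) + r(Y_j) \right) \le \mathbb{P}\left( \log\left(\frac{w}{w'}\right) \le r\left(\frac{S_i}{w}\right) + r\left(\frac{S_j}{w}\right) \right),$

where $S_i, S_j$ represent independent standard exponentials.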

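Recalling that $\{W_k : k \in \mathbb{Z}\}$ form a copy of the scale invariant Poisson
process $X$, and $\mathbb{P}(X \cap (w', w) = \emptyset) = w'/w$,

$\mathbb{E} N_R \le \iint_{w' < w} \frac{dw}{w}\, \frac{dw'}{w'}\, \frac{w'}{w}\ 3 \exp\left( -\frac{w \log n}{6 c_0} \right) \mathbb{P}\left( \log\left(\frac{w}{w'}\right) \le r\left(\frac{S_i}{w}\right) + r\left(\frac{S_j}{w}\right) \right).$

Recall further that for two consecutive points $W_j < W_i$ of the scale invariant
Poisson process, conditional on $W_i = w$ the distribution of $W_j$ is that of $Uw$,
where $U$ is uniformly distributed in $(0, 1)$. Thus the right hand side above is
equal to

$\int_{w > 0} \frac{dw}{w}\ 3 \exp\left( -\frac{w \log n}{6 c_0} \right) \mathbb{P}\left( \log\left(\frac{1}{U}\right) \le r\left(\frac{S_i}{w}\right) + r\left(\frac{S_j}{w}\right) \right).$

Since $-\log U$ is exponentially distributed, with density bounded above by
one, we have

$\mathbb{E} N_R \le \int_{w > 0} \frac{dw}{w}\ 3 \exp\left( -\frac{w \log n}{6 c_0} \right) \mathbb{E}\, 2 r\left(\frac{S}{w}\right)$

$\qquad = 6 c_1 \int_{w > 0} \exp\left( -\frac{w \log n}{6 c_0} \right) dw \int_{y > 0} e^{-wy}\, e^{-c_2 \sqrt{y}}\, dy$

$\qquad \le 6 c_1 \int_{y > 0} e^{-c_2 \sqrt{y}}\, dy \int_{w > 0} \exp\left( -\frac{w \log n}{6 c_0} \right) dw = O\!\left( \frac{1}{\log n} \right).$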

This completes the proof of 
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